PREFACE

The idea of writing an overview and assessment of the research
literature on the comprehension of meaningful verbal discourse in
educational media originated in 1965 with a Study Panel, of which I
was a member, established under Title 7 of the National Defense
Education Act of 1958. The members of the Study Panel felt that such
a review would be useful to educational planners and policy-makers,
researchers, and designers of instructional materials. I was persuaded
to undertake this review, but at the time, neither I nor the other
members of the panel had a realistic idea of the dimensions of the task.
That the literature of this field is so enormous and that relevant
work is going on in such a wide variety of domains is in itself a
finding that justifies the assignment.

The time, staff, and budget requested for this project were grossly
underestimated. Extensive as this report and the accompanying bibliography
are, circumstances have forced me to compromise my standards, and I
would be the first to admit that the report is somewhat superficial
at many points. The bibliographical search could have been expanded
in many directions, and there could have been a more thorough examination
and critique of the literature in certain areas. Major emphasis has
been placed on literature produced in the period 1961-1970, but much
selectivity had to be exercised because of the large amount of material
available. Undoubtedly I have missed a number of important items.

It is my hope that this document will to some extent serve the 
function that is the objective of any survey of this kind — to organize 
the present state of our knowledge into a framework such that duplication 
and redundancy in research will be reduced, ongoing research can be
facilitated, and contemplated research can take note of the gaps and 
neglected areas that have become apparent in the process of mapping 
the terrain. 

Miss Mary Harcar, a Research Assistant at Educational Testing
Service during 1968-1969, was of much help in the early phases of
assembling the bibliography. I am grateful to her, as well as to the
typing, clerical, and editing personnel at ETS who assisted in putting
the report together.



John B. Carroll 
SUMMARY

This review, based on a survey of more than 1200 items in the
research literature, begins by attempting to outline a theory of language
comprehension and learning from language. A lengthy chapter is devoted
to problems in the measurement of comprehension and of learning from
connected discourse. It then considers, in successive chapters, the
role of various kinds of factors in promoting comprehension and learning
from connected discourse: stimulus characteristics such as readability,
listenability, vocabulary, grammatical structure, and logical organization;
stimulus modality (audition vs. vision); manner of presentation;
factors in learning and memory; and individual differences. Problems
for further research are pointed out.
INTRODUCTION AND SCOPE



Even in various educational media such as films, television, and pro- 
grammed instruction, by far the largest amount of teaching activity involves 
"telling things" to students, whether by speech or the printed word. A 
picture is usually meaningless without a caption, and most educational films 
would be only minimally intelligible without sound track or titles. In 
instructional television, it is common practice for the lecturer to perform 
as if he were in a classroom. Programmed instruction makes liberal use of 
verbal messages. It seems obvious that meaningful verbal discourse (MVD) is 
the primary tool of teaching. We expect students to learn most things by 
being told about them. 


It is the purpose of this review to bring together, and to interpret,
for their possible utility in the preparation and use of educational media,
available research findings concerning how pupils understand, learn, and
remember the content of MVD. The review will also identify gaps in the
research literature and point out problems for further research.

The scope of the review can perhaps best be indicated by starting from
what Schlesinger (1966b, p. 227) calls a "faceted" definition of
communicability research. According to him, communicability is

the [ease / readiness] with which linguistic material in [written / spoken]
form with (given) [cognitive / emotional] characteristics of [content / style]
is [decoded / encoded] by members of a (given) population.

The faceted definition may be read, then, in 32 ways by taking each member
of a pair in combination with selections of one term in each of the other
pairs. In the pair ease/readiness, ease focuses attention on the
characteristics of the material, whereas readiness refers to characteristics
of language users. By including the pair decoded/encoded, the definition
covers problems of both understanding and production. The present review is
not concerned with problems of how people produce language (except
incidentally in connection with the problems of how appropriate instructional
materials can be produced). It is concerned essentially with how people
(more specifically, pupils or students) decode linguistic material, i.e.,
understand it, and more than that, how they learn and remember the content
of the material. Let us, therefore, adapt Schlesinger's definition to our
purposes by deleting the word "encoded."
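
The way a faceted definition multiplies into separate readings can be made concrete with a small sketch. The Python below is offered only as an illustration and is not part of Schlesinger's formulation; the names FACETS, TEMPLATE, and readings are hypothetical. It assumes the five two-term pairs of the definition as quoted in the text, forms every combination of one term per pair, and then repeats the count with "encoded" deleted.

```python
from itertools import product

# The five paired facets of the definition as quoted in the text (assumed wording).
FACETS = [
    ("ease", "readiness"),
    ("written", "spoken"),
    ("cognitive", "emotional"),
    ("content", "style"),
    ("decoded", "encoded"),
]

# Sentence frame with one slot per facet, in the order listed above.
TEMPLATE = (
    "the {0} with which linguistic material in {1} form with (given) "
    "{2} characteristics of {3} is {4} by members of a (given) population"
)


def readings(facets):
    """Return every reading obtained by choosing one term from each facet."""
    return [TEMPLATE.format(*choice) for choice in product(*facets)]


all_readings = readings(FACETS)

# Deleting "encoded" leaves the last facet with a single term.
decoding_only = readings(FACETS[:-1] + [("decoded",)])

print(len(all_readings))   # 2 ** 5 combinations
print(len(decoding_only))  # 2 ** 4 combinations
print(decoding_only[0])
```

With five two-term facets the product has 2^5 = 32 members; restricting the last facet to "decoded" halves that to 16, which is the portion of the definition taken over in this review before further facets are added.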

But we must add several phrases in order to delineate the complete scope
of this review. The ease or readiness with which linguistic material is
understood depends not only upon some of the factors already mentioned in
Schlesinger's definition but also upon at least two other important factors:
(1) the supporting context of the message, e.g., the immediate physical
environment, the speaker-hearer relationship, or a still or moving picture
that illustrates some aspect of the message, and (2) the manner of its
presentation, e.g., whether fast or slow, in a single presentation or in
repeated presentations, with or without feedback of information concerning
the student's response to the material, etc. A description of what this
review intends to cover can therefore be stated as:

the [ease / readiness] with which linguistic material in [spoken / written]
form with (given) [cognitive / emotional] characteristics of [content / style],
presented in a (given) manner, [with / without] supporting context, is decoded
(understood, learned, remembered) by members of a (given) population.

By "ease" of decoding (understanding, learning, remembering) we mean the
degree to which there is understanding, learning, or remembering on the part
of the student. By "readiness" we mean the degree to which the student is
able to understand, learn, or remember, as a function of his aptitudes,
previous experiences, likes, preferences, goals, etc., interacting with the
content and style of the message.

We will deal with both spoken and written messages; we will address ourselves
mainly, however, to their cognitive rather than to their emotional
characteristics, but we will deal with factors of both content and style.
Presentation and contextual factors will be given attention. We discuss later
(Chapter 3) what may be meant by "decoding," "understanding," "learning,"
and "remembering." The populations with which we will be concerned are
primarily populations of school learners, at any age from the kindergarten
to adulthood.

This review will focus on how people learn from language, not on how they
learn language. While an attempt is made to point out the particular problems
in learning from language presented in "educational media," actually the focus
is upon learning from language in any context: the classroom, the study, the
library, or whatever. The only special characteristic of educational media
that is of interest here is the fact that ordinarily they present highly
standardized, controlled, and repeatable sequences of verbal discourse. (One
can show a film a number of times, whereas a teacher's verbal output will normal- 
ly differ from occasion to occasion.) In fact, most of the research literature 
on instructional film and television seems to indicate that use of these media 
produces very much the same degree of learning as direct instruction. Much 
of this review will cover findings from the experimental laboratory or from 
observational settings where there were no special "educational media" other 
than perhaps a blackboard and chalk, or a textbook. 

It may be asked, why study learning from verbal discourse? Most of us 
live in an environment constantly filled with meaningful verbal discourse, 
and we think we understand all or most of it. In the first place, the MVD 
that we are most accustomed to and believe we nearly always understand is 
what may be called "everyday speech." The German language, in fact, has a
special term for this kind of language: Umgangssprache. The reader may be
reminded, however, that many kinds of language we encounter in daily life --
editorials in newspapers, certain public speeches, etc. -- may not be as
readily understood as everyday speech. Secondly, as educated adults we may
fail to appreciate the enormous variations in understanding of language on
the part of children or of less educated adults. An examination of the
results of almost any reading or listening comprehension test will convince
one that the average level of performance in understanding verbal discourse
that departs from everyday language is far from justifying any assumption
that pupils understand everything they hear or read. But these comprehension
tests usually measure only immediate understanding of language materials
after one presentation; any teacher knows that even if the child understands
something upon its first presentation, this does not mean that he will retain
it over long periods. Therefore, we must study not only language
comprehension but also the phenomena of learning and retention.

Obviously, some of the failure to comprehend and retain the contents of 
verbal discourse may be attributed to the child's lack of maturity and educa- 
tion; the child fails to understand because at the time he is tested he has 
not learned enough about language and the world about him. Ordinarily, 
teachers attempt to choose educational media that are appropriate to the
educational level of their classes, but it is not always easy or possible to
do so; even if there were sure guides to assessing the verbal difficulty of
educational materials, teachers would still face the fact of considerable
heterogeneity of verbal ability in their classes. 

It is the basic premise of the present review that pupils' failures in
comprehension (and retention, insofar as comprehension is a prerequisite for
it) are due at least in part to the characteristics of educational materials
themselves, or to the ways in which they are presented and used. Verbal
discourse in educational media, besides being sometimes of inappropriate
difficulty level for the intended audiences, is often needlessly complex,
poorly organized, and poorly presented. I have tried to point out how research
literature suggests ways to improve the preparation and presentation of
verbal discourse in educational media, and how there can be more adequate
matching of educational material and media with student capacity to profit
from these materials. The literature will be considered under the following
headings:

a. Message and message source variables, i.e., variables having
to do with the content of the message, its phraseology, style,
and construction, and its source. (See Chapter 4.)

b. Stimulus modality factors, i.e., whether presentation is
auditory, visual, or audiovisual, and whether it is combined
with other types of presentations (e.g., pictorial) that
provide supporting context. (See Chapter 5.)

c. Presentation factors, i.e., factors having to do with rate,
frequency, mode, and structuring of presentations. (See
Chapter 6.)

d. Phenomena of learning and retention. (See Chapter 7.)

e. Student factors, i.e., variables concerned with the
characteristics and the educational background of the student.
(See Chapter 8.)

The potential scope of any thoroughgoing treatment of learning from verbal
discourse is enormous; it covers large areas of the psychology of learning
and the psychology of language. I must impose certain limits upon the present
treatment:

a. In specialized areas that have already been covered by
published reviews, I will present only the conclusions
of these reviews, with any additional updating and
interpretation that may seem appropriate.

b. Attention will be focused on learning from MVD that is intended
to instruct or at least to inform. Little attention will be
paid to MVD that is primarily intended to persuade students or
to change their attitudes, except to the extent that the
informative function of such discourse is also recognized.

c. Attention will be restricted largely to MVD put forth by a
single source, in contrast to MVD that arises in the course
of a dialogue or a sequence of classroom interactions. Thus,
I will be concerned usually with "one-way" communication from
a source to a pupil or group of pupils.

d. I shall not be concerned with problems of language acquisition
or with learning to read. That is, the research to be reviewed
here generally assumes that the pupil is already "competent"
to recognize the elementary units and patterns of a meaningful
verbal discourse, whether it be in spoken or written form.
It is difficult to state this assumption precisely, because
there is always the possibility that even though the student
"knows the language" and "can read" (in the sense of being
able to decode printed words into their spoken counterparts),
his failure to comprehend a particular discourse may stem
from his lack of knowledge of particular words or syntactical
patterns contained in it. Thus, I will consider problems
of language acquisition and comprehension that arise beyond
the stage of "primary language acquisition" or of "beginning
reading."

e. I shall not be concerned with problems of auditory or visual
deficiencies, or with conditions under which messages are
presented with low signal-to-noise ratio or poor fidelity,
poor illumination or viewing, etc. That is, the research to
be considered here assumes that the pupil is capable of hearing
or seeing the message, and that the conditions under which
the message is presented enable him to do so with no essential
loss of information. It is often the case, of course, that
educational media that present MVD are poorly seen or heard,
but conditions that result in such poor seeing or hearing
(with any consequent loss of comprehension or learning) are
not within the scope of this review.

Previous Reviews of Learning from MVD 

It is my intention to prepare a review that will overlap minimally with
other reviews of problems in learning from educational media that have been
prepared for the NDEA Title VII Study Committee (May 1965a, 1965b, 1966;
Briggs, 1967) or a review by Travers (1967) of certain problems in audio-
visual education. Nevertheless, I wish to point out the relation of this 
review to certain other interpretive literature summaries. 

The general problem of learning from MVD seems never to have been
subjected to a thoroughgoing literature review. There are, of course, many
reviews and even whole textbooks devoted to the psychology of learning in
general or to particular aspects of it, but with a few exceptions (e.g.,
Ausubel, 1963, 1968), these have not considered specifically the subject of
learning from MVD. The characteristic approach of psychologists to problems
of learning has been to attempt to deal with it in terms of general principles
drawing heavily from the literature on animal learning and on human learning
of nonsense syllables or arrays of single isolated words. Insofar as certain
general principles may have relevance for the learning of MVD they cannot be
ignored or dismissed, but discourse learning presents certain special problems
for theoretical and general psychology that have been for the most part
overlooked or sidetracked. (The nature of these problems [...] and elaborated
in Chapter [...].) For example, in Keppel's (1964) review of verbal learning
in children, [...] any problem relating [...] to learning of nonsense
syllables, word lists, and the like.

This is not to say, of course, that phenomena having to do with the
learning of or from meaningful verbal discourse have escaped the attention
of psychologists. A paragraph memory test, in which the subject was required
to listen to a short paragraph and then repeat it verbatim, was a component
of early intelligence tests (Binet and Simon, 1908; Terman, 1916). [...]
(pp. 280-283) wrote of the subjective phenomena involved [...]. [...] full
review of research literature in this area appears to have been the one
published by Welborn and English (1937), who were concerned mainly with the
differences between what they called "verbatim" and "logical" learning.
(Roughly, "verbatim" learning is learning of a discourse, while "logical"
learning is learning from a discourse, i.e., its content and ideas.) [...]
(1940) touched on certain problems of MVD learning in [...] of research in
school learning. Although a number of psychologists have mounted research
programs [...] learning from MVD (e.g., Cofer, 1941, [...] p. 556), there
appears to have been no major literature [...] findings since the Welborn
and English review cited above. [...] comprehensive literature reviews and
are devoted to the exposition of a particular theoretical position. There is
a highly useful summary by Petrie (1963) but it is restricted to studies of
"informative speaking" and does not provide a detailed analysis of the
literature. Reviews of "readability" and listenability research by Chall
(1958) and Klare (1963) are helpful but concern themselves largely with
certain message style variables in comprehension. Travers (1967) has reviewed
literature bearing on the comparative efficiency of auditory and visual
presentations of MVD, but his concern is mainly with problems of information
transmission and channel capacity. The summary of studies in instructional
television and film that was prepared by Reid and MacLennan (1967) is useful
but is not focused on the particular problem of learning from MVD.

In the published literature, then, there seems to be no comprehensive 
review of work on learning of or from MVD. 




Chapter 2 
SOME THEORY 

At the highest level of abstraction and yet simplicity, we may say that
learning from meaningful verbal discourse takes place when some more or
less permanent change occurs in a person's conceptual structure as a result
of his having received a verbal message, with the proviso that this change
of conceptual structure has some sort of veridical connection with the
content of the message. For example, when a person hears the message "Your
house is on fire" we may suppose that he has "learned" from this message if
he now "knows" that his house is on fire, or at least entertains a belief in
the possibility that his house is on fire. His knowledge or belief about the
state of his house is, presumably, a change in his conceptual structure, since
he did not previously know or believe that his house was on fire. Any further
response he may make, such as running to sound an alarm, or perchance saying
"I'm delighted" (if he hoped all along it would burn down), is irrelevant to
the fact of learning. Now of course, he may have already become aware from
another source that his house was on fire, in which case the only change in
his conceptual structure is his knowledge of the fact that his informant
knows this too and felt impelled to tell him. In this latter case, we would
probably say that there was no learning, at least no learning of the content
of the message, and it is to exclude such a case that it may be necessary to
require that the change of conceptual structure have a veridical connection
with the content of the message, that is, that the change corresponds to
information built into the message. Nevertheless, even without a change of
conceptual structure there could still be a kind of understanding of the
message in the sense that the hearer could verify its truth or falsity or
otherwise evaluate it. We will try to explicate some of these concepts below.
One idea that has been introduced is that of conceptual structure.
Already the use of this phrase will signal that I tend to favor what may be
called a cognitive account of mental activity, in contrast to the rigid
behavioristic account that has been favored by some writers and that attempts
to describe human behavior purely in terms of observable stimuli and responses.
An early example of such an account, as applied to language behavior, is the
little story that the linguist Bloomfield (1933, pp. 22-27) tells about
how Jill gets Jack to fetch her an apple from a tree:

Suppose that Jack and Jill are walking down a lane. Jill is hungry.
She sees an apple in a tree. She makes a noise with her larynx,
tongue, and lips. Jack vaults the fence, climbs the tree, takes the
apple, brings it to Jill, and places it in her hand (Bloomfield,
1933, p. 22).

According to Bloomfield, Jill made a "linguistic substitute reaction" to her
hunger and her sight of the apple in the tree which, for Jack, constituted a
"linguistic substitute stimulus" that resulted in his "practical reaction,"
i.e., vaulting the fence and getting the apple. Bloomfield concludes that
"language enables one person to make a reaction (R) when another person has
the stimulus (S)" (p. 24; italics in the original). Evidently, Jack's
understanding of Jill's speech (and presumably his learning from it) is indexed,
according to this account, by the "practical reaction" he made that satisfied
Jill. Obviously, this account is highly oversimplified; yet it is about as
far as we can go if we restrict ourselves to observing overt responses. For
all we know, Jack could have been responding to a pointing gesture; perhaps
Jack would have fetched the apple even without a sign from Jill; maybe Jack
didn't even understand Jill's language; etc., etc. Even if we examine the
structure of Jill's utterance (e.g., "Jack, get me an apple in that tree!") in
terms of other utterances Jack and Jill might exchange on this or other
occasions, i.e., the whole corpus of utterances in Jack and Jill's language, we
might not be able to trace the connections between "practical events" and
"linguistic substitute reactions (and stimuli)" that could account for the
sequence of observed events. In fact, even the account which Bloomfield gave
did not completely exclude certain unobservable variables: Jill's hunger,
Jill's sight of an apple.

Undoubtedly, the most extensive attempt to develop a rigorous behavioristic
account of language behavior is that of Skinner (1957). According to
Skinner, "the listener can be said to understand a speaker if he simply behaves
in an appropriate fashion. ... In 'instruction' we shall see that he
understands to the extent that his future behavior shows an appropriate change.
These are all ways in which we are said to 'understand a language'; we respond
according to previous exposure to certain contingencies in a verbal
environment" (p. 277). Skinner goes on, however, to describe "another process"
that is involved in understanding:

Suppose we start to read a fairly difficult paper. We respond
correctly to all the words it contains, so far as dictionary meanings
go, and we are familiar with what is being talked about; still, we
may not understand the paper. We say that we do not "get it" or do
not "see what the writer is driving at" or why he says what he says.
What we mean is that we do not find ourselves responding in the same
way. The paper does not supplement verbal behavior in us which
exists in any considerable strength. We possess each of the responses
in the sense that it is part of our verbal repertoire, but we do not
tend to emit it under the same circumstances as the author of the
paper. This meaning of understand is in accord with the layman's
use of the word. We understand anything which we ourselves say with
respect to the same state of affairs. We do not understand what we
do not say. We misunderstand when we say something else with the
same words -- that is, when we behave in a given way because of the
operation of different variables.

Suppose, now, we go over the paper again -- as we must if we are
ever to understand it. What processes will explain the changes which
take place? Intraverbal sequences established during the first
reading will, of course, leave their effect: the paper will now be familiar.
To some extent, therefore, we will tend to say the same things. Through
this process alone we might eventually memorize the paper. But that
would not be enough; we might still say that we do not understand it,
though we should probably say that we now understand it to some extent.
Other processes must take place if we are to get the point the writer
is making. Instruction [in a special sense] . . . will probably occur.
Some sentences in the paper will present two or more verbal stimuli
together in what we call definition; the resulting change in our
behavior will be felt when these responses occur separately elsewhere
in the text. Other sentences, through predication, will produce other
transfers of response by increasing our "knowledge." Our behavior
will be altered on subsequent readings in the direction of increased
understanding because our usage will then be closer to the writer's
(Skinner, 1957, p. 278).

A basic paradox presents itself in such a "behavioristic" account: the
description inevitably involves subjective terms -- terms that are inadmissible
within the behavioristic framework: "we do not find ourselves responding in
the same way" as the writer when we do not understand him. . . . When we are
informed by definitions appearing in a text, "the resulting change in our
behavior will be felt when these responses occur separately elsewhere in the
text." "Our behavior will be altered on subsequent readings in the direction
of increased understanding. . . ." (Emphasis added.) A strictly behavioristic
account seems ultimately unable to deal with a person sitting quietly reading
a book and making subjective responses to it, whether those responses
represent understanding, misunderstanding, or hopeless lack of comprehension,
for there is little chance that one could ever trace all the consequences of
those responses in some future behavior, particularly since some of the future
behavior itself would be largely unobservable.

There have been other accounts of the behavioristic type. For example,
Staats (1968, pp. 511 ff.) warns against thinking that "comprehension" involves
"some ineffable 'mental' process" and claims that instead it involves the
production of "new sequences of classically conditioned meaning responses" on
the analogy of sensory conditioning (p. 513). Although Staats has conducted
much experimental work on the production of such meaning responses, there is
at present some question as to whether his results can be accounted for by
a strict classical conditioning interpretation (Rozelle, 1968). In any case,
the only advantage of Staats's account over Skinner's appears to be that it
attempts to describe the moment-to-moment responses of the reader or hearer
to language, even if they are unobservable, and refer them to constructs
arising from general behavior theory. In this sense Staats's account
represents a transition to a cognitive type of theory that I will now present,
or perhaps to the type of "neo-behavioristic associationism" espoused by
Berlyne (1965).

The cognitive view uses the data of subjective experience along with
data from objective observations to construct a model of mental activity that
hopefully can be refined and confirmed by further experimental investigation.
It views the higher nervous system as an entity that receives, processes,
transforms, and puts forth information through a series of detectable stages
or cycles. Among the proponents of varieties of cognitive theory are Hebb
(1949), Simon (1957), Neisser (1967), and Reitman (1965). One of the essential
ideas of the cognitive view is that the information-processor contains some
sort of storage of memory traces accumulated (undoubtedly with certain
transformations) from previous experience; this storage contains an enormous
number of schemas, more or less enduring patterns of brain-activity dealing
with the individual's experiences of his own mind, his body, his sensations
and perceptions, his environment, etc. This storage is continually being added
to; as new experiences accumulate, they tend to have the effect of
transforming or modifying the already existing schemas. Somewhat on the
analogy of the arithmetical processing unit of an electronic computer, the
information-processing entity contains a special part that is concerned with
the processing of percepts that are formed from moment to moment; some of
these percepts are selected, as it were, for more or less permanent storage
in memory while others may be held aside for later evaluation or even discard.
At least one part of this information-processor acts as a "seat of
consciousness" and processes percepts with a high-priority rating. Even though
the information-processor may be thought of as consisting of separate parts,
it is actually interconnected in an enormously complex way; it may act as if
a number of separate sub-processors are operating simultaneously and yet in
relation to each other. Large parts of the memory are more or less immediately
accessible and responsive under the appropriate conditions: for example, the
memory can immediately report recognition of any one of a large number of
percepts that have been previously experienced and return information about
these percepts (Shepard, 1967). The whole state of this information-processing
entity at any given moment may be regarded as the individual's conceptual
structure at that moment.

Language is the principal means of communication among the cognitive
structures of different individuals. (It is not the only means, for other
actions of an individual besides verbal behavior, e.g., gestures, gross motor
activities, etc., can provide this intercommunication by furnishing the basis
of meaningful percepts to other individuals.) Language may also play some
part in intra-individual cognitive processes, such as "thinking," but it is
beyond the scope of this monograph to discuss this possibility except
incidentally, in connection with language comprehension processes. At any rate,
the principal function of language may be said to be to provide a system
whereby one individual can attempt to modify the conceptual structure of one
or more other individuals. That is, language provides a system whereby one
individual can encode certain percepts into messages that under appropriate
conditions evoke representations of these percepts in the
information-processing entity of another. If A reports to B, "I have a
headache," this does not generally cause B to have a headache, but it does
evoke the concept "headache," which is a representation of past percepts of
B's own headaches.

Figure 2.1. A general model of communication and learning through language

The general model of communication and learning through language can be
depicted in its gross aspects in Figure 2.1. Psychological processes in the
originator of a message are represented on the left-hand side of the figure,
processes in the receiver of the message on the right-hand side. Insofar as
the message may have any kind of permanent form (a written document, a
tape-recording, etc.) the processes in the receiver may take place at any time
after those in the originator, even centuries later. Nevertheless, the
originator perceives some kind of occasion to communicate: he may know that
some willing hearer is present, or assume that a potential reader will receive
his written message. Whatever the occasion, his percept gives rise to a
process whereby selected aspects of his momentary cognitive structure are
encoded into a linguistic message. From the standpoint of its function, the
message has two aspects: (1) it conveys some kind of "information," and (2)
it has some intended stimulus value. The information it conveys may be
regarded as a report of certain aspects of the originator's momentary
cognitive structure; such a report may include a report of gaps in the
information possessed by the originator or potential gaps in the receiver's
information (as when a teacher asks a pupil a question). The intended stimulus
value of the message may embrace one or more of the following:

(1) Drawing the attention of the receiver to some state of affairs
    represented in the originator's cognitive structure, that is,
    eliciting a corresponding change in the receiver's cognitive
    structure.

    E.g., "It's five o'clock." "John came."

(2) Eliciting an affective response on the part of the receiver,
    whether or not a corresponding affective response is present in
    the originator.

    "How late it is!" "Surprise!" "You're wonderful!"

(3) Eliciting a further verbal response (i.e., a reply) from the
    receiver (usually indicating a gap in the originator's information).

    "What time is it?" "Tell me your name." "What's 2 + 2?"

(4) Eliciting any given behavior (cognitive, affective, or motor) on the
    part of the receiver.

    "Consider this fact." "Don't feel sorry." "Write your name here."

The information encoded in the message and its intended stimulus value affect
the linguistic structure of the message, but not in any one-to-one manner.
That is, a given kind of information and a given intended stimulus value may
be encoded in a number of ways, e.g.,

    What's your name?

    Tell me your name.

    I want to know your name.

all have approximately the same information and intended stimulus value (to
elicit a reply containing the hearer's name).
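
The many-to-one relation between linguistic form and message content can be illustrated with a small data structure. What follows is only a minimal sketch in Python, not part of the model itself: the class names, the lookup table, and the wording of the "information" field are hypothetical and are introduced solely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class StimulusValue(Enum):
    """The four intended stimulus values listed in the text."""
    DRAW_ATTENTION = 1     # e.g., "It's five o'clock."
    ELICIT_AFFECT = 2      # e.g., "You're wonderful!"
    ELICIT_REPLY = 3       # e.g., "What time is it?"
    ELICIT_BEHAVIOR = 4    # e.g., "Write your name here."


@dataclass(frozen=True)
class MessageContent:
    information: str               # what the originator reports (or lacks)
    stimulus_value: StimulusValue  # what the originator intends to evoke


# Hypothetical content shared by the three example sentences.
WANTS_NAME = MessageContent(
    information="originator lacks the hearer's name",
    stimulus_value=StimulusValue.ELICIT_REPLY,
)

# Hypothetical lookup table: several surface forms, one underlying content.
SURFACE_FORMS = {
    "What's your name?": WANTS_NAME,
    "Tell me your name.": WANTS_NAME,
    "I want to know your name.": WANTS_NAME,
}


def same_content(a: str, b: str) -> bool:
    """True when two surface forms encode the same information and stimulus value."""
    return SURFACE_FORMS[a] == SURFACE_FORMS[b]
```

The table is keyed by surface form because that is the direction in which a receiver works; an originator would need the inverse mapping, from one content to several candidate forms.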

On the receiver side, the receiver's momentary state of cognitive
structure, along with environmental stimulation and/or self-stimulation, arouses
orienting processes that allow him to "attend" to the message. If he knows
the language, he decodes it into its linguistic elements and detects
information contained in it and some intended stimulus value. This process of
decoding may not be either instantaneous or accurate; in any case it is
affected by the receiver's cognitive structure. The decoding process produces
a potential conceptual structure. (More detailed discussion of linguistic
decoding occurs below.) Once the "sense" and "intended stimulus value" of
the message have been detected (whether correctly or incorrectly), these
aspects are submitted to what I have called an "acceptance testing" buffer.
This represents a postulated process whereby the receiver decides whether
the "sense" of the message is true or false, or otherwise worthy of further
attention, retention, or response. The result of this "acceptance testing"
determines how the content of the message is stored in the receiver's
cognitive structure, and how it may be acted upon in future behavior. The
receiver may decide that the message contains important new information, in
which case it may be tagged in that way as it is stored in cognitive structure.
On the other hand, the receiver may decide that the information is not new, or
false, or contradictory, or hypothetical; he may decide that the originator
of the message was lying, or that he himself does not wish to act upon the
intended stimulus value, in which case the information contained in the
message will be tagged accordingly as it enters cognitive structure. The
acceptance testing process is in any case affected by current cognitive
structure and indirectly by current environmental and self-stimulation. The
outcome of the communication process is a change in the receiver's cognitive
structure, represented in Figure 2.1 by the part of the cognitive structure
box labeled "assignment of new cognitive structure." This change may be
considered an instance of learning. As determined by the manner in which the
new cognitive structure has been tagged, it may also result in a further
response on the part of the receiver, for example, a motor response, or a
verbal reply (in which case the receiver becomes now an originator). But
the cognitive structure itself will undergo further changes, over time, with
new experiences and particularly, with further communicative exchanges.
These changes also are phenomena of learning and retention.
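
The sequence of stages on the receiver side (decoding, acceptance testing, tagging, storage) can be made concrete in a toy program. The following Python sketch is an illustration under strong simplifying assumptions and is not a claim about how the postulated processes actually operate: the class and method names, the three tags, and the notion of a "distrusted source" are all hypothetical, and decoding is reduced to a check that the receiver knows the language.

```python
from dataclasses import dataclass, field


@dataclass
class Decoded:
    sense: str           # the "plain sense" the receiver detected
    stimulus_value: str  # the intended stimulus value the receiver detected


@dataclass
class Receiver:
    known_languages: set
    # Toy cognitive structure: maps a stored sense to the tag it received.
    cognitive_structure: dict = field(default_factory=dict)
    distrusted_sources: set = field(default_factory=set)

    def decode(self, language, sense, stimulus_value):
        """Linguistic decoding; fails if the receiver does not know the language."""
        if language not in self.known_languages:
            return None
        return Decoded(sense, stimulus_value)

    def acceptance_test(self, decoded, source):
        """Decide how the decoded content is to be tagged before storage."""
        if source in self.distrusted_sources:
            return "doubted"
        if decoded.sense in self.cognitive_structure:
            return "not new"
        return "new"

    def receive(self, language, sense, stimulus_value, source):
        decoded = self.decode(language, sense, stimulus_value)
        if decoded is None:
            return None  # nothing enters cognitive structure
        tag = self.acceptance_test(decoded, source)
        if tag != "not new":
            # The change in cognitive structure, i.e. the instance of learning.
            self.cognitive_structure[decoded.sense] = tag
        return tag
```

A receiver that knows English will tag a first report of "it is five o'clock" as new, tag a repetition as not new, and store nothing at all for a message in a language it does not know. Note that the acceptance test consults the current cognitive structure, as the text requires.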

It will be noted that a broken line has been drawn between the
environmental stimulation of the message originator and that of the message
receiver. This is to represent the fact that even if the originator and the
receiver live at different epochs of history, at least some features of their
environment are shared. For example, ancient authors may be said to have
written about certain aspects of their environments that share features in
common with the environment of the present-day reader -- the nature of the
physical universe and certain aspects of the social environment. Communication
and learning have to do with changes in people's cognitive structures with
respect to their environments: in this sense communication and learning have
to do with meaning or semantics.

The above description is extremely generalized and lacking in detail; it
is intended merely to set the stage for further exposition of a theory of
communication and comprehension.

Two Senses of "Understanding"

A theory of learning from MVD requires us to distinguish two general
senses of the verb understand. As a matter of fact, these two senses are
distinguishable by semantic and syntactic analysis; rules can be stated that in
many cases can unambiguously assign one or the other of these senses to a
given instance of the word.

Consider the following possible messages:

(1)  I understood "He's coming."

(1a) I understood "Er kommt" (German).

(1b) I understood that utterance.

(1c) I understood the broadcast.

(2)  I understood his coming.

(2a) I understood him.

(3)  I understood he's coming.

(4)  I understood German (when I was young).

(5)  I understood carburetors.

It is interesting to notice, incidentally, that sentences (1), (2), and
(3) differ only very slightly, yet a competent native speaker will instantly
interpret the verb understood in different senses, because of the semantic
and syntactical status of the groups of words that follow.

Sentences (1) and (1a) clearly exemplify the sense of the verb understand
whereby it means "to apprehend, on a particular occasion, a particular meaning
of a message, or some presentation of a message by a person or other entity
capable of originating a message." Let us designate this meaning as
understand₁.

Sentences (3), (4) and (5) exemplify the general sense of the verb
understand whereby it means "to be in a state of knowledge, competence, or
cognitive feeling (e.g., sympathy) with respect to something." In sentence
(3), the knowledge was attained by being informed; in (4), it was attained
by some process of language acquisition; in (5) by some process of learning
and experience. Let us designate this meaning as understand₂.

Several of the above sentences are now seen to be ambiguous.

(1b) I understood₁ that utterance = I understood₁ what it said, the plain
     message.

     I understood₂ that utterance = I understood₂ why it was said.

(1c) I understood₁ the broadcast = I understood₁ the plain sense of the
     message it contained.

     I understood₂ the broadcast = I understood₂ why it was made.

(2)  I understood₁ his coming = I understood₁ what he intended to communicate
     by coming.

     I understood₂ his coming = I understood₂ the reasons for his coming,
     the situation that prompted it, etc.

(2a) I understood₁ him = I understood₁ what he said.

     I understood₂ him = I understood₂ his nature, characteristics,
     propensities.

Even (4) might be explicated either as "I was able to understand₁
sentences in German when I was young," or possibly as "I was able to
understand₂ the nature of the German language when I was young." Actually,
understand₂ has a number of somewhat different senses, as one can see by
consulting a dictionary; the main concern here is to distinguish understand₁
as a special sense which can occur when the object of the verb is a message
or some presentation of a message.

These two senses of understand correspond, in fact, to two distinguishable
processes in understanding and learning from verbal discourse.
Understanding₁ refers to the process of apprehending the "plain sense" and
intended stimulus value of a message, while understanding₂ refers to the
knowledge in cognitive structure that may result from learning from all kinds
of experience, including verbal discourse. Although the distinction may
seem obvious or trivial, it is one that has not always been properly observed
in research on learning from verbal discourse. Some researchers have been
concerned solely with understanding₁, but many have been concerned with
understanding₂ without realizing that understanding₁ is often a prerequisite
for understanding₂. Even the study of understanding₁ entails concern for
understanding₂ because an individual's understanding of a message often
clearly depends upon his prior state of knowledge with respect to the content
of the message.

The distinction also has implications for deciding how to measure
understanding and learning. In an ideal communication situation -- at least,
ideal for the transmission of knowledge -- aspects of the originator's
cognitive structure would be transmitted or exactly replicated in the
receiver's cognitive structure. Thus, Einstein might have been able to
communicate all his knowledge about relativity to a learner in such a way that
the recipient had the same cognitive structure with respect to relativity as
Einstein. Obviously this could never have happened, for there would have been
information losses (and gains) at various points in the communication process.
It is doubtful that even Einstein could have encoded his cognitive structure
without information loss, both because language may be an imperfect
instrument for such encoding and because Einstein might not have been able to
select or retrieve precisely the information that a given learner might need.
Even if precisely the right information had been perfectly encoded by Einstein,
it is unlikely that a given learner would have been able to decode Einstein's
messages with perfect fidelity, or, once decoded, to integrate the decoded
messages into his own cognitive structure without various losses and gains of
information. Einstein's understanding₂ of relativity could not correspond
exactly to the learner's understanding₂ of relativity, because the learner
started with a different cognitive structure from Einstein's. Nevertheless,
we might content ourselves with a measurement of the learner's understanding₂
of relativity before and after he received instruction from Einstein, to
assess the effect of Einstein's messages about relativity. Even this would
be difficult, for there is no sure way of measuring the contents of a person's
cognitive structure. We can only probe cognitive structure by using the
learner partly as a source of further messages and responses and partly as a
recipient-evaluator of messages. From such probes we might be able to build
up evidence from which we could make at least some inferences about the
learner's understanding₂ of relativity.

Here is the attempt of two educators to summarize techniques of measuring
understanding₂ on the part of learners (Findley and Scates, 1946, p. 64).

1. In every subject-matter area there are available at present many
   well-known procedures for the evaluation of understanding.

2. To provide evidence of understanding, evaluation situations must
   contain an element of novelty, but not too much novelty.

3. Understanding is of many kinds and many degrees, and evidence
   is to be sought on appropriate levels.

4. Procedures employed to measure understanding should provide
   evidence of appreciation of primary reality.

5. Since intelligent behavior in many situations involves the
   ability to recognize the relevancy and sufficiency of data,
   evidence of this ability should be sought.

6. Evidence of understanding is to be found in originality of
   performance on the part of pupils.

7. Evaluation procedures should be selected with due regard for the
   likelihood of their evoking evidence of the kind of understanding
   that is required.

8. In obtaining evidence of understanding, care should be exercised
   to insure that the pupil's response reflects his actual level of
   understanding.

9. The program of evaluation should be planned so as to foster the
   development of habits of self-appraisal on the part of pupils.

A much more limited objective is to try to measure an individual's
understanding₁ of a message. We do not require that the learner fully accept
the content of the message, or learn it in the sense of putting it in more
or less permanent storage; we simply wish to find out whether he has
understood₁ the message "as it stands." To say that an individual can
understand₁ a message "as it stands" requires the assumption that the message
itself contains a "meaning" which is derivable solely from its linguistic
structure. It may appear that the bulk of messages encountered in daily life
or in ordinary reading do indeed contain such meanings, and it may be that
some do. Upon analysis, it will be found that not all sentences or utterances
are unambiguous by themselves; they are usually disambiguated by some sort of
surrounding "context" of either a verbal or non-verbal character -- context
that the recipient can take account of in interpreting, that is, finding a
meaning of the sentence. (This may or may not be the "intended" meaning
encoded by the originator of the message.) If the recipient of a message is
permitted to have enough contextual information he should be able to arrive at
the one most likely "reading" or interpretation of the message. There will,
however, remain a small residue of messages that are not disambiguated even
by the context. Chapter 3 will survey the various methods that have been
employed to measure understanding₁ of messages.

Theories of Sentence Comprehension (understanding₁)

After a long period in American linguistics during which problems of
syntax were largely neglected, the theory of transformational generative
grammar developed by Chomsky (1957, 1965) has come to dominate the thinking of
psycholinguists concerned with processes of sentence understanding and
production. While transformational generative grammar does not itself aim to
explain or otherwise account for the actual behavior or performance of
speakers, hearers, readers, and writers in using language, it does aim to
provide an abstract model of the so-called competence of these language users.
Presumably, the language user's competence plays some role in his use of
language; exactly what that role may be is, in fact, the task of the
psycholinguist to discover.

A brief exposition of key concepts in the theory of transformational
generative grammar will be useful to the reader in understanding some of the
subsequent discussion. According to Chomsky and his followers, a grammar of
a language is a finite set of rules that will generate any one of a
potentially infinite number of sentences that will be accepted by users of the
language as "grammatical" and none of the sentences that would be rejected by
language users as "ungrammatical." Hence, the theory of the grammar of a
language is a theory of what the language user "knows" in order to generate
and understand grammatical sentences, that is, a theory of his "competence."
The criterion of grammaticality is thus the intuition of the idealized
language user -- one who has absorbed in some way the rules of the language
and can reflect them in his use of the language.

The formulation of transformational grammar has undergone a number of
changes since first proposed by Chomsky; in fact, it is still undergoing
change. In a brief statement prepared by Chomsky, the grammar of a language
is characterized as

a system of rules that determine a certain pairing of sound and
meaning. It consists of a syntactic component, a semantic component and
a phonological component. The syntactic component defines a certain
(infinite) class of abstract objects (D, S), where D is a deep
structure and S a surface structure. The deep structure contains all
information relevant to semantic interpretation; the surface structure,
all information relevant to phonetic interpretation. The semantic
and phonological components are purely interpretive. The former
assigns semantic interpretations to deep structures; the latter
assigns phonetic interpretations to surface structures. Thus the
grammar as a whole relates semantic and phonetic interpretations,
the association being mediated by the rules of the syntactic
component that define paired deep and surface structures ....

This formulation should be regarded as an informal first
approximation (Chomsky, 1967, pp. 406-407).

Later,

. . . the linguistic evidence now available seems to point consistently
to the conclusion that the syntactic component consists of rules
that generate deep structures combined with rules mapping these into
associated surface structures. Let us call these two systems of rules
the base and the transformational components of the syntax,
respectively. The base system is further divided into two parts: the
categorial system and the lexicon (pp. 419-420).

As a concrete example, Chomsky takes as a base system a small subset of
English consisting of a lexicon:

    it, fact, John, Bill, boy, future    (Nouns)

    dream, see, persuade, annoy          (Verbs)

    sad                                  (Adjective)

    will                                 (Modal)

    the                                  (Determiner)

and a set of "re-write" rules in the categorial system:

    S -> (Q) NP AUX VP    [read: Sentence may be rewritten as
                           (Question), Noun-Phrase,
                           Auxiliary, Verb-Phrase]

    VP -> be ADJ

    VP -> V (NP) (of NP)

    NP -> (DET) N (that S)

    AUX -> past

    AUX -> M

    N, V, ADJ, DET, M -> Δ    (where Δ represents any "terminal" element
                               in a surface structure)

and proceeds to show how such sentences as John was sad and The boy will
persuade John of the fact that Bill dreamt can be derived or "generated"
therefrom. For example, the derivation of John was sad can be represented
by a "tree diagram" as follows:

    S
    |-- NP
    |   `-- N
    |       `-- John
    |-- AUX
    |   `-- past
    `-- VP
        |-- be
        `-- ADJ
            `-- sad

(The formative was is derived from past be by a supplementary
transformational rule.)
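
The way in which the re-write rules license a derivation can be checked mechanically. The following Python sketch is offered only as an illustration, under stated assumptions: trees are written as hypothetical (label, children) pairs, lexical items are inserted directly under their category rather than through a dummy terminal symbol, and the function spell_out is merely a stand-in for the supplementary transformational rule that yields was. None of the function names comes from Chomsky.

```python
import itertools

# Lexicon of the small subset of English given in the text.
LEXICON = {
    "N": {"it", "fact", "John", "Bill", "boy", "future"},
    "V": {"dream", "see", "persuade", "annoy"},
    "ADJ": {"sad"},
    "M": {"will"},
    "DET": {"the"},
}

# Categorial rules. Each right-hand side is a list of (group, optional) pairs;
# an optional group corresponds to a parenthesized item in the text.
RULES = {
    "S": [[(("Q",), True), (("NP",), False), (("AUX",), False), (("VP",), False)]],
    "VP": [
        [(("be",), False), (("ADJ",), False)],
        [(("V",), False), (("NP",), True), (("of", "NP"), True)],
    ],
    "NP": [[(("DET",), True), (("N",), False), (("that", "S"), True)]],
    "AUX": [[(("past",), False)], [(("M",), False)]],
}


def expansions(symbol):
    """Yield every sequence of symbols into which `symbol` may be rewritten."""
    for rhs in RULES[symbol]:
        choices = [[list(group), []] if optional else [list(group)]
                   for group, optional in rhs]
        for combo in itertools.product(*choices):
            yield [sym for part in combo for sym in part]


def licensed(tree):
    """True if every node of the tree is permitted by the rules and lexicon."""
    label, children = tree
    if label in LEXICON:       # lexical category dominating one word
        return len(children) == 1 and children[0] in LEXICON[label]
    if label not in RULES:     # grammatical formative such as "past" or "be"
        return children == []
    child_labels = [child[0] for child in children]
    return child_labels in list(expansions(label)) and all(
        licensed(child) for child in children)


def terminals(tree):
    """Read the terminal string off the tree from left to right."""
    label, children = tree
    if label in LEXICON:
        return [children[0]]
    if not children:
        return [label]
    return [t for child in children for t in terminals(child)]


def spell_out(words):
    """Stand-in for the supplementary rule that realizes past + be as 'was'."""
    out, i = [], 0
    while i < len(words):
        if words[i:i + 2] == ["past", "be"]:
            out.append("was")
            i += 2
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)


JOHN_WAS_SAD = ("S", [
    ("NP", [("N", ["John"])]),
    ("AUX", [("past", [])]),
    ("VP", [("be", []), ("ADJ", ["sad"])]),
])
```

Applied to JOHN_WAS_SAD, licensed confirms that each node conforms to a rule, terminals reads off the string John past be sad, and spell_out converts it to John was sad. A tree lacking the obligatory NP or AUX under S is rejected.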

A tree diagram thus represents the relation between the "deep" and the
"surface" structures of the sentence. It also represents the information
required for semantic interpretation of the sentence. For Chomsky, "competence"
involves the ability (implicitly) to assign "structural descriptions" to
sentences.

A famous example may make this clearer.

(1) John is eager to please. 

(2) John is easy to please. 

Although these sentences appear to have similar "surface" structure, their
"deep structures" are different, as shown by the fact that we can convert (2)
into another form:

(2a) To please John is easy

but we cannot similarly convert (1) to *To please John is eager without
destroying the meaning. If we follow Chomsky's doctrine, the "base structure"
of (1) derives John from a noun phrase that is subject of a verb phrase is
eager to please, while the base structure of (2) derives John from a NP that
is the object of a verb to please in a deep-structure verb phrase (To please
John is easy). According to Chomsky, our "internalized grammar" is
automatically cognizant of these grammatical relationships.

In order to make possible such recognition, of course, "competence" must
include a sort of "dictionary" in which the possible lexical and grammatical
features of the formative elements (words, affixes, etc.) of the language can
be looked up and retrieved. It must also contain some representation of the
rules by which base structures are realized in surface structures -- not, to
be sure, a completely conscious knowledge of these rules. Chomsky and his
followers are silent as to the actual psychological status of these rules;
this is an issue that is regarded as outside the province of linguistics.
Chomsky's object is simply to formulate the grammar (including syntactic,
semantic, and phonological components) in such a way that it will most
parsimoniously achieve the object of being able to generate (or assign
structural descriptions to) all the grammatical sentences of the language and
none of the ungrammatical ones.

Chomsky's transformational generative grammar has given rise to a truly
enormous literature in linguistics -- including applications of the theory to
special problems in the grammar of English and many other languages, further
developments of theory (e.g., Katz and Postal, 1964), and critical discussions
(see the bibliography by Dingwall, 1965).

Chomsky's discussions of the distinction between "competence" and
"performance" have implications for the field of psycholinguistics. "A
generative grammar," he says, "is not a model for a speaker or a hearer. It
attempts to characterize in the most neutral possible terms the knowledge of
the language that provides the basis for actual use of language by a
speaker-hearer. . . . When we say that a sentence has a certain derivation with
respect to a particular generative grammar, we say nothing about how the
speaker or hearer might proceed, in some practical or efficient way, to
construct such a derivation. These questions belong to the theory of language
use -- the theory of performance" (Chomsky, 1965, p. 9). In brief remarks
"towards a theory of performance" he carefully distinguishes between
"grammaticality" and "acceptability," the former a property of sentences formed
by a grammar, the latter a property of sentences that are "perfectly natural
and immediately comprehensible without paper-and-pencil analysis, and in no
way bizarre or outlandish." He suggests that profitable studies of
acceptability might consider the role of certain grammatical phenomena, such as
nested, self-embedded, multiple-branching, left-branching, or right-branching
constructions. (As will be seen in this monograph, many studies of these
phenomena have now been performed.)

During the early years of the 1960's, a popular research problem among
psychologists was the attempt to demonstrate the "psychological reality" of
various grammatical phenomena, in particular, certain "transformation rules"
such as passivization, negation, and question-formation. Unfortunately,
although this work seemed to produce interesting results, its basis has now
come under much questioning, partly because of modifications of
transformational theory and partly because of flaws in experimental procedure
and design. This monograph will review, in a later chapter, the present status
of some of this work.

For current opinion on the theory of performance, I draw on the report
of a conference held in Edinburgh, March 1966 (Lyons and Wales, 1966). I
emphasize those aspects of the discussion that relate to the understanding
of language. Of particular relevance here are papers by Thorne, by Wales and
Marshall, and by Fodor and Garrett. I will try to summarize the discussion
in terms of a number of major issues.

1. What is the nature of a theory of competence? From the standpoint
of the linguist, a theory of competence is essentially an axiomatization of
the rules of a language, similar to an axiomatization of the rules of the
number system. As such, it is an abstraction. In saying that the rules of a
language "generate" sentences, the linguist uses the term generate in a purely
formal sense: this phraseology makes no statement as to whether in the normal
use of language individuals generate sentences according to such rules.
Nevertheless, it can be pointed out that a theory of competence is
"psychological" at least in one sense: that it "purports to be a principled
account of the linguistic knowledge of human beings rather than a totally ad
hoc description of the language" (Wales and Marshall, 1966, p. 29). Chomsky has
distinguished two levels of descriptive adequacy of grammars: (1) WEAK
"descriptive" power -- whether all and only the possible terminal strings of a
language are generated; and (2) STRONG generative or "explanatory" power --
whether the structure assigned to these strings describes correctly how the
idealized native speaker would understand these strings. Particularly in the
evaluation of grammars as to their STRONG generative power, then, it would
seem that a theory of competence involves statements about language use, i.e.,
the understanding of sentences. It seems clear, then, that there is at least a
very intimate and perhaps inextricable relationship between a theory of
competence and any theory of performance.
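
The difference between the two levels can be shown with a toy case. The Python sketch below is an illustration only; the phrase "old men and women" and the nested-tuple notation for structures are my own assumptions and do not come from Chomsky or from Wales and Marshall. Two structural descriptions yield one and the same terminal string, so a test of WEAK power (which inspects strings alone) cannot tell them apart, whereas a test of STRONG power must ask which structure corresponds to the way the string is understood.

```python
# Two hypothetical structural descriptions of one terminal string.
# Reading A: "old" modifies the whole conjunction (old men and old women).
READING_A = ("old", ("men", "and", "women"))
# Reading B: "old" modifies "men" only (old men, and women of any age).
READING_B = (("old", "men"), "and", "women")


def terminal_string(structure):
    """Flatten a nested structure into its left-to-right terminal string."""
    if isinstance(structure, str):
        return [structure]
    return [word for part in structure for word in terminal_string(part)]


def weakly_equivalent(a, b):
    """Same terminal string, whatever the structure assigned."""
    return terminal_string(a) == terminal_string(b)


def strongly_equivalent(a, b):
    """Same terminal string and the same structure assigned to it."""
    return a == b
```

Here weakly_equivalent(READING_A, READING_B) holds while strongly_equivalent does not, which is the sense in which the evaluation of STRONG power already involves a statement about how a sentence is understood.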

It is agreed, in any case, that a theory of performance must presume an 
adequate competence model, i.e., an adequate axiomatization of the language. 
Experiments concerning speaker-hearer performance must be designed and inter- 
preted in the light of such a model. 

[It may be noted that Schwarcz (1967) has protested against the assump- 
tion that there can be an "idealized speaker-hearer" whose competence is 
formalized, because such a concept is a fiction. He suggests that this con- 
cept be replaced by that of the "typical speaker-hearer" — "a set of basic 
mechanisms for understanding, using, and learning language, plus a memory 
structure for the storage of both linguistic and nonlinguistic facts." In 
essence, Schwarcz rejects a theory of competence unless it is subsumed under
a theory of performance.] 

2. What would a satisfactory "theory of performance" be? A preliminary
definition is given by Wales and Marshall (1966, p. 30): "It is a theory of
how, given a certain linguistic competence, we actually put it to use --
realize it, express it. It is also a theory of the limitations of the
mechanisms, which enable us to express our linguistic competence. ... We want
to be able to explain NORMAL performance -- when the translation from
competence to performance is proceeding smoothly -- just as much as we want to
explain errors and deviations." As a theory, a theory of performance may be as
much an abstraction as a theory of competence, but the abstract quality of
any theory is precisely what gives it its generalizing power. A theory of
performance might, according to Wales and Marshall, consist of two parts:
a part concerned with the general type of system that makes competence and
performance possible, and a part concerned with the specific mechanisms
involved. The task of the psycholinguist is to discover these mechanisms.
The theory might include an algorithm that would describe the manner in which
the individual processes information either in sentence production or in
sentence understanding. (A tentative algorithm has, for example, been
proposed by Dewar, Bratley, and Thorne (1969) which reasonably simulates
certain aspects of sentence understanding.)

3. Is it profitable at this stage to develop models or schemas of
linguistic performance? Wales and Marshall (1966, p. 55) propose such a
schema, reproduced in Figure 2.2. They do not claim it to be a MODEL, however,
offering it only as serving to indicate the hypothesized order of processing
linguistic information and to suggest points for study. For sentence
understanding, it is to be read from the bottom up; for sentence production,
from the top down. It assumes that the basic unit of linguistic performance is
the sentence, rather than the word; that the analysis of sentences is
continuous, rather than operating on input strings in temporary stores; and
that "at any given time, the process operates only uni-directionally -- that
is, recognition and production procedures cannot be simultaneous." (It may
be commented that this last assumption is counter-intuitive; certainly during
sentence production there are processes whereby one recognizes the sentence
being produced.) Nevertheless, Blumenthal comments in the same volume (p. 84)
that Wales and Marshall's schema is "too lofty an abstraction to be of heuristic
value" in suggesting techniques, mnemonics, and cues that the language user
employs. He also feels that it is counter-intuitive in suggesting that input
processing proceeds from surface-structure to deep-structure to semantic
interpretation. In this very comment Blumenthal demonstrates the usefulness
of such schemas in raising issues. My own recommendation is that we continue
to propose and test schemas of this sort, making them as complicated as the
data warrant.

Figure 2.2. A schema of linguistic performance (Wales & Marshall, 1966)

For comparison, a considerably more complicated schema (or "model") of
sentence construction proposed by Danks (1969b), Figure 2.3, may be examined.
Danks is concerned with the processing not only of "normal" well-formed
sentences but also of various kinds of deviant sentences. For this purpose
he introduced "Ziffian" rules (Ziff, 1964) to allow the individual to find
the most probable path to a well-formed sentence. Notice also that Danks
introduces "context" as additional input, and that the output is an "idea."
Presumably this "idea" is what gets stored in Wales and Marshall's "conceptual
matrix." A somewhat similar schema is proposed by Schwarcz (1967) in a pair
of "flowcharts" for linguistic performance. Figure 2.4a is analogous to
Danks' schema for sentence processing, showing the output as a "conceptual
structure." In Figure 2.4b this conceptual structure is taken as input for
further processing depending upon whether the sentence is interrogative or
declarative, and depending upon whether the information in the sentence arouses
a "curiosity motivating condition." Thus, Schwarcz introduces a feature somewhat
similar to the "acceptance testing buffer" I have postulated in Figure 2.1.

Figure 2.3. A schema of sentence comprehension (Danks, 1969b)

Figure 2.4. (a) Semantic interpretation of an utterance

Even without schematic diagrams, it is possible to speculate about some of 
the detailed processes in sentence understanding. A spoken sentence input to the 
hearer inevitably comes in a temporal sequence from "left to right" but there is 
obviously some possibility for "re-scanning" material already heard and stored 
in temporary short-term memory. Printed sentences are normally read from left- 
to-right (leaving aside the reading methods advocated by some "speed reading" 
courses), but there is much more opportunity for rescanning. In any event, 
there is room for investigation of how the hearer/reader is able to perceive or
"compute" deep structure from surface structure. Does he build a tree diagram
"from left to right" and from "top to bottom," or the reverse? Does sentence
processing proceed in any such straightforward fashion at all, in either direc- 
tion? Various superficially plausible models for sentence processing have been 
proposed by such theorists as Johnson (1965), Osgood (1963), and Yngve (1960),
but the present consensus seems to be that none of these models are even approxi- 
mately correct. It seems best, for the time being, to wait for further theoriz- 
ing and experimental data before fixing upon a detailed model. 
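
To give a rough sense of the size of the space in which any tree-building procedure must search, the following sketch counts the distinct binary bracketings of a string of n words. The function name and the restriction to unlabeled binary trees are assumptions made purely for illustration; none of the models cited above is committed to them, and labeled or non-binary trees would only enlarge the count.

```python
from math import comb


def binary_bracketings(n_words: int) -> int:
    """Count the unlabeled binary trees over n_words leaves in fixed order.

    This is the Catalan number C(n-1) = comb(2(n-1), n-1) // n.
    """
    if n_words < 1:
        raise ValueError("a string must contain at least one word")
    k = n_words - 1
    return comb(2 * k, k) // (k + 1)


# The count grows very rapidly with sentence length.
for n in (2, 5, 10, 15, 20):
    print(n, binary_bracketings(n))
```

A five-word string already admits 14 bracketings and a ten-word string 4862, before any grammatical labels are attached to the nodes.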

One type of model that seems particularly objectionable is the "analysis-by- 
synthesis" model originally proposed by Matthews (1962) whereby the sentence 
processor generates multiple possible "synthesized" sentences from the input and 
then selects the sentence structure that matches the input. Fodor and Garrett 
(1966, pp. 139-141) show formally that such a device could not possibly operate
in real-time because of the enormous number of searches and matchings that would
be involved.

4. Is it necessary for the hearer to arrive at a "full structural
description" of a sentence in order to understand it? By a "full structural
description" is meant an assignment, by the hearer/reader, of each word or
other linguistic element to some position in the grammatical structure of
the sentence: e.g., that a certain phrase is the subject of the sentence,
that a certain word or phrase modifies it, that a certain part of the
sentence is the predicate, that a certain adverb (e.g., probably) modifies
the whole of the rest of the sentence, etc. (There is a further question,
with which I will not deal here, as to whether the "full structural
description" involves perceiving the "deep structure"; for example, in hearing the
sentence The boy was hit by the ball does the hearer have to recognize that
this is a transformation of a sentence The ball hit the boy? Obviously, the
hearer must recognize that the causal agent was the ball, not the boy, but
the question becomes one of whether sentence perception actually involves
recognizing a transformation.)

Fodor and Garrett (1966, p. 142) give a most confident affirmative to
the question raised above: "That it is the full structural description of a
sentence which is the psychologically pertinent output of a recognition
device is not now open to serious doubt. It is only in terms of the relations
the structural description marks that such intuitively-available notions
as grammaticality and syntactic ambiguity can be reconstructed, and only by
reference to these relations that a general characterization of syntactic
similarity between sentences can be formulated. To put it slightly differently:
the structural descriptions assigned by generative grammars automatically
provide formal counterparts for grammatical relations, the recognition
of which lies within the perceptual capacity of speakers. This fact
can be explained only if we assume that the perceptual recognition of sentences
involves the recovery of their structural descriptions." Two discussants,
however, are not convinced: "Just as the logician makes use of heuristic
devices in proving theorems, so it seems to me certain that the human brain
must do so in recognizing and producing sentences. It does not seem to me to
have been proven that all sentences must be completely decomposed into their
deep structure in order to be uttered or understood. It seems possible that
performance may be controlled more by a system of analogies than by a more
rigorous generative procedure in which the axioms of linguistics are directly
represented in the brain" (Sutherland, 1966, p. 161). This idea is exemplified
by reference to producing utterances: "For example, if the brain can
categorize words into types, new sentences could be formed not by directly
looking up a very general rule but by looking up an instance of the use of a
word of a similar type," but an analogous argument might be made for speech
understanding.

Another discussant: ". . . I really cannot see why the mechanism of a
hearer's understanding need be supposed to produce a full structural description
for each wave-form understood; it does not seem even to have to produce
all the transformation-markers (e.g., semantically redundant displacement
markers, as in phone up phone . . . up, can be omitted), let alone the
phrase-markers" (Cohen, 1966, p. 169). Cohen goes on to state that producing
a full description would be "an extraordinarily uneconomical procedure,"
considering the vast number of messages we are exposed to. He proposes that
we "look for the most economical means of storing information for the purpose
of showing that we do understand it."

The issue that is joined here seems to be marked with confusion as to
the contexts in which sentences are understood. Clearly, Fodor and Garrett
are correct in insisting that understanding implies a full structural description
when the hearer/reader attends carefully to every word of an utterance;
the fact that even the omission or misplacement of a word is likely to be
detected under such circumstances suggests that the hearer/reader apprehends
the "full structural description." Even in carefully attending to a message
composed in telegraphic style, as a headline, the reader infers a structural
description that specifies every significant relation among the words of the
message. Now, Cohen seems to be speaking of conditions when the hearer/reader
does not attend to every word, as through momentary lapses of attention
or in rapid scanning of a text. Under these conditions, it is probable that
the hearer/reader still infers something like a full structural description
of the material he attends to, filling in certain gaps from his previous
knowledge or by purely logical processes that are a function of the redundancy
of the message. I conclude that Fodor and Garrett are correct, in
principle, but that Cohen has introduced the important idea that complete or
nearly complete structural descriptions can be produced on the basis of limited
information. There is no guarantee, of course, that such structural descriptions
will be as correct as they are likely to be if the full text is attended
to. An interesting research problem would be to study the structural descriptions
attainable on the basis of limited information, e.g., in responding
to "telegraphic speech" or randomly scrambled words.

In the course of his discussion, Cohen introduces a seemingly plausible
model for speech understanding that may be worth investigating. He finds
this model consistent with a wide range of experimental data:

So the hearer's mechanism I am proposing is one that will map wave-
forms on to memory-storage instructions. Such a mechanism must be
capable of recognizing occurrences of those morphemes and combina- 
tions of morphemes (i.e., nouns, verbs, adjectives, etc.) that consti- 
tute categories under which information is usefully stored alongside 
established relevant rules for identification, individuation, infer- 
ence, and so on; and it must be capable of distinguishing those 
morphemes from morphemes that are not of this kind (i.e., articles, 
conjunctions, etc.). It must also be capable of reversing certain 
transformations that have taken place in the generation of the utter-
ance, in order to identify the appropriate filing categories (e.g.,
reversing displacements, like George put his own friends up from George
put up his own friends), and breaking down logically compound sentences
into their constituent kernels plus the relations between these. It 
must be capable of filing under each appropriate category a morpho- 
phonemic description of the kernel sentence or sentences plus 
transformation-markers which CAN be processed for a full structural 
description if the hearer needs to show, or utilize, his understand- 
ing in a way that requires this processing. And the hearer's mechanism 
must also be capable of treating its description of the wave-form as 
a cross-reference to other filings of the same wave-form, and of 
filing alongside this description a description of certain contextual 
circumstances of the wave-form's utterance (in order to identify the
denotations of personal pronouns, demonstratives, etc., and to assist 
in residual disambiguation; I assume that in most cases contextual 
circumstances will have determined the initial filing of polysemes). 

... In short, what I am suggesting is that for a hearer to under-
stand a speaker's utterances correctly is to file a partial descrip-
tion of it under the same memory-storage categories, and to be
prepared to take to at least some extent the same linguistic and
non-linguistic action on it, as the speaker would be prepared to
take if the roles were reversed. To misunderstand is to file under
different categories, or to file a misdescription of it; and to
fail to understand it is not to file it at all, or not to file a
description of it that is adequate for the purposes of eliciting
implications, answering questions, checking truth-values, and so on
(Cohen, 1966, pp. 169-170).
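
Cohen's three-way distinction among understanding, misunderstanding, and failure to understand can be made concrete with a toy sketch. Everything named here (the `Filing` record, the `outcome` function, the use of an empty description to stand for an "inadequate" one) is a hypothetical simplification introduced for illustration. It is not Cohen's own formalism, and it ignores the readiness-to-act component of his proposal.

```python
from typing import FrozenSet, NamedTuple, Optional


class Filing(NamedTuple):
    """Hypothetical record of how one party has stored an utterance."""
    categories: FrozenSet[str]  # memory-storage categories it is filed under
    description: str            # partial description of the utterance


def outcome(speaker: Filing, hearer: Optional[Filing]) -> str:
    """Classify the hearer's filing against the speaker's."""
    # Nothing filed, or an empty (inadequate) description: failure to understand.
    if hearer is None or not hearer.description:
        return "not understood"
    # Filed under different categories: misunderstanding.
    if hearer.categories != speaker.categories:
        return "misunderstood"
    # Same categories but a misdescription: also misunderstanding.
    if hearer.description != speaker.description:
        return "misunderstood"
    return "understood"


speaker = Filing(frozenset({"George", "friends"}), "George put up his own friends")
wrong_file = Filing(frozenset({"George"}), "George put up his own friends")
print(outcome(speaker, speaker))     # understood
print(outcome(speaker, wrong_file))  # misunderstood
print(outcome(speaker, None))        # not understood
```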



5. What is the difference between recall of a sentence and understanding
it? Obviously, purely on the basis of immediate memory span a string of
words (provided it is not too long) can be recalled without understanding it.
A large proportion of the experiments that have been done on sentence processing
have not required true understanding of the sentence; they have required
only "learning" and recall. Blumenthal (1966, p. 83) suggests that understanding
is not necessary for memory; it only makes a sentence easier to
remember. The criterion of "understanding" still stands as the recovery of
the underlying grammatical structure, as well as the accompanying semantic
information. To the extent that words perceived without syntactic structure
convey semantic information, some of this semantic information may be recovered
in "pure recall" and certain syntactic constructions imposed on this information
in the process of recall. This would be a case of "pseudo-understanding"
since the constructed syntactic information might in fact be incorrect. It
may be found quite difficult to separate understanding from recall in experimental
work. The most successful procedure appears to involve making the
subject's task one in which he must submit the sentence input to some verification
procedure with reference to a non-linguistic stimulus, e.g., a picture.
(Chapter 3 will discuss this matter more fully.)

If the sentence presented is understood in the sense defined here, an 
interesting question has to do with what, precisely, is recalled at some 
later point in time. An experiment by Mehler (1963) suggests that the base
structure and the transformational rules converting to surface structure
are remembered separately, the base structure being generally remembered
longer and better. (Later, we shall adduce more evidence for this sort of
finding, with the suggestion that actually something deeper even than base 
structure — some non-linguistically coded "meaning" — is remembered longest.) 

6. What grammatical variables influence sentence processing? A large
literature on this topic is now available. Among the major conclusions which
seem reasonably well established are the following:

a. Hearer/readers tend to process sentences in terms of their
constituents. For example, Anglin and Miller (1968) found that sentences were
more easily learned when their words were grouped according to syntactic
constituents rather than otherwise: "The boy found it/in the woods" would be more
easily learned than "The boy found/it in the woods." A number of experiments
have shown that in a dichotic listening situation where a sentence is heard in
one ear and a click is heard at a certain point of time in the other ear, the
subjective placement of the click tends to be displaced towards boundaries
of syntactic constituents. Schlesinger (1966b) found that the eye-voice-span
tends to extend to the end of a possible constituent chain.

b. Certain aspects of deep structure, particularly the logical
subject of a sentence, influence recall and understanding more than elements
of surface structure. Blumenthal (1967) found that the logical subject was
a more efficient prompt than the nonagent phrase in remembering sentences
such as The gloves were made by tailors vs. The gloves were made by hand.

c. Some failures of understanding are due to incomplete analysis
of the input. For example, interpreting The boy was hit by the girl as
equivalent to The boy hit the girl can occur when the subject is under pressure
(Slobin, 1963).

d. Sentences with self-embeddings are harder to understand or
remember than their right-branching equivalents. Representative materials
were studied by Miller and Isard (1964): A sentence with no embeddings (She
liked the man that visited the jeweler that made the ring that won the prize
that was given at the fair) is more easily processed and learned than one with
3 embeddings (The ring that the jeweler that the man that she liked visited
made won the prize that was given at the fair).

e. Syntactic complexity as measured by number of transformations
in the derivation of surface structure from base structure is, however, not
always a sure guide to ease of sentence processing. Results of a number of
early experiments on such transformations as passive, question, and negation
were flawed by confounding of these variables with sentence length, meaning,
etc., in the opinion of Fodor and Garrett (1966).

f. Violations of semantic selection rules in "semi-sentences"
result in poorer sentence processing. For example, an anomalous sentence such as
"Pink accidents cause sleeping storms" is less well remembered than a "normal"
sentence such as "Pink bouquets emit fragrant odors" (Marks and Miller, 1964).

It may be said, however, that "semi-sentences" introduce a type of semantic
complexity or distortion that is not merely a matter of violating selection rules.
Semantic complexity is also introduced by negation (Wason, 1961), unless the
negation is used merely to emphasize that a fact is contrary to expectation
(Wason, 1965).

Many of the above conclusions will be examined more closely, and the
evidence updated, in later chapters of this monograph. A number of remarks
seem appropriate here, however, as comments on the motivation and presuppositions
of the research on sentence processing reviewed in the various chapters of the
symposium edited by Lyons and Wales (1966):

Obviously the motivation for this research is to gain data for making inferences
about the processes or mechanisms in the understanding of sentences. Incidentally,
some of it may provide insight into the nature of linguistic competence,
but if linguistic competence is simply the speaker/hearer's knowledge of his
language, and if that competence can be represented as a formal axiomatic system
that can be verified independently of psychological experiments, we can expect
such experiments to throw little light on linguistic systems. Our main expectation
from psychological experimentation that has been reviewed here is that it
will enable us to construct and refine a theory of linguistic performance.

In the experimental settings that have been employed, there is admittedly
a good deal of artificiality necessary in order to permit adequate control
of variables that might otherwise affect the results. Some elements of this
artificiality are:

i) Typically, the subjects are normal, reasonably well educated native
speakers of English. Few experiments on processes of sentence understanding
have been conducted with children, aphasics, schizophrenics, or other special
populations. (This is not to deny that there is a large literature on the
language of children, aphasics, etc.; the point is that little of this literature
contains experiments on processes of sentence understanding.)

ii) Typically, the sentences presented to the subjects are quite ordinary
sentences using high- and medium-frequency words; they are presented as
self-contained, isolated sentences; if a number of sentences are presented, they
are unrelated in content. (A few experiments present "deviant" sentences of
various kinds, but again, these are presented in isolation and they usually
contain relatively familiar words or constructions.) The content of the
sentences is very ordinary. They are only "hypothetically" informative; a
subject in an experiment is very unlikely to want to add to his permanent
memory store the content of a sentence like "The boy hit the colored ball";
it would be only by an exercise of imagination that the subject could conceive
a situation where such a sentence would be truly informative.

iii) Sentences are ordinarily presented in the absence of any context with
which they might otherwise be accompanied. The subject has to learn a sentence
like "The boy hit the colorful ball" without being informed what boy and what
ball are being spoken of. Exceptions to this observation are provided by a
few experiments that employ pictorial context as referents for sentences that
are to be verified. Also, a few experiments exemplify the use of materials
that are inherently meaningful without context, such as true or false sentences
about the number system ("Five is smaller than two"; "Five is an odd number";
"Five precedes thirteen").

iv) Sentences are presented for immediate understanding or immediate recall,
only very rarely for recognition or recall after a considerable time-period.

v) Motivation of subjects is typically high, at the level one would expect
in an experiment where subjects are paid volunteers who are alert and eager to
please the experimenter.

One wonders whether the results of experiments conducted under such
artificial conditions will easily generalize to "real-life" situations
involving other than the normal, educated speaker/hearers who are the subjects
in these experiments, and involving meaningful verbal discourse that consists
of multiple, connected sentences with ample contextual determination. Even
if we consider only single sentences, it is conceivable that in "real life"
with appropriate context a complex self-embedded sentence like The race that
the car that I sold won was held last summer (adapted from an example given
by Miller, 1962b, p. 755) would be much more easily understood than it would
in a psychological experiment. (See also Freedle and Craun, 1970.)

On the other hand, in principle everything we would want to know about
sentence understanding could arise from the study of single sentences. Since
single sentences are, according to transformational grammar (and
"common sense"), infinitely expandable by recursive rules, a single sentence can
itself contain all the contextual information necessary for its understanding.
Whether this principle can be sustained in a report that is concerned with
the role of meaningful verbal discourse in audio-visual educational media
(with their paraphernalia of non-verbal presentations) is left to the judgment
of the reader.

Comprehension vs. Inference 

The kind of "sentence comprehension" that has been discussed up to now
involves "assigning structural descriptions" to the elements of the sentence.
It involves also understanding the "meanings" of the separate components,
including rare or technical words such as ferrule, soffit, or transducer, if
they happen to occur in the sentence. Comprehension of a longer discourse
such as a paragraph or an essay would involve not only these processes but
also identifying the persons, things, ideas, etc. that are referred to one or
more times throughout the text, even though in different words, and following
the development of more complex ideas. We have been, in short, discussing
"comprehension" as understanding the "plain sense" of a message.

"Comprehension" is, however, often used in a much looser sense to include
both of the senses of understanding that were defined in an earlier
part of this chapter. An examination of a test of "reading comprehension" or
of "listening comprehension" will usually show that the test is designed so
that the individual's score will reflect not only his ability to understand the
"plain sense" of the material but also his ability to make inferences, i.e.,
to create new information that is implied by the plain sense of the message.

Various instances of simple inference can be given. Consider the
sentence John is taller than Mary, and Dick is shorter than Mary. It is
conceivable that understanding the "plain sense" of this sentence would not
include the inference that Dick is shorter than John; the relations between
John and Mary, and between Dick and Mary, might be "understood" without the
further understanding of the relation between John and Dick.
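
The gap between what the sentence states and what it implies can be shown in a short sketch. The representation of "taller than" as ordered pairs and the function name `transitive_closure` are assumptions made for illustration only. The sketch is not offered as a model of how a hearer actually draws the inference.

```python
from itertools import product


def transitive_closure(pairs):
    """Return every (taller, shorter) pair derivable by transitivity."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(tuple(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


# The "plain sense": John is taller than Mary; Mary is taller than Dick.
stated = {("John", "Mary"), ("Mary", "Dick")}

# The inference is whatever the closure adds to the stated relations.
inferred = transitive_closure(stated) - stated
print(inferred)  # {('John', 'Dick')}
```

The stated pairs correspond to comprehension of the plain sense; the single added pair is the new information created by inference.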

Inference is also involved in syllogistic reasoning: All A are B; All
B are C; Therefore all A are C. Consider the following paragraph, used by
Frase (1969c) in an experiment on paragraph reading:

The Fundalas are outcasts from other tribes in Central Ugala.
It is the custom in this country to get rid of certain types of people.
The hill people of Central Ugala are farmers. The upper highlands
provide excellent soil for cultivation. The farmers of this country
are peace loving, which is reflected in their art work. The outcasts
of Central Ugala are all hill people. There are about fifteen dif-
ferent tribes in this area.

This paragraph contains enough information for a subject to infer that the 
Fundalas are peace-loving, even though this is not explicitly stated. 
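
The chain of syllogisms needed to reach this conclusion can be traced in a brief sketch. The dictionary below encodes four sentences of the paragraph as "All X are Y" links; treating each of them as a universal statement is an assumption made here for illustration, as are the names `all_are` and `infer_properties`.

```python
def infer_properties(start, all_are):
    """Follow 'All X are Y' links from start and collect every class reached."""
    reached, frontier = set(), [start]
    while frontier:
        current = frontier.pop()
        for superclass in all_are.get(current, ()):
            if superclass not in reached:
                reached.add(superclass)
                frontier.append(superclass)
    return reached


# Premises taken from the paragraph, each read as a universal statement.
all_are = {
    "Fundalas": ["outcasts"],
    "outcasts": ["hill people"],
    "hill people": ["farmers"],
    "farmers": ["peace loving"],
}

print("peace loving" in infer_properties("Fundalas", all_are))  # True
```

Three successive syllogistic steps are required, and the premises are scattered among sentences that are irrelevant to the conclusion.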

Our survey of processes involved in understanding of text must take account 
of inferential processes as well, since what is learned from a text may include 
the outcomes of such reasoning processes. To attempt to draw conclusions on
the nature of inferential processes would, however, take us far beyond the
scope of this survey.



Chapter 3 

THE MEASUREMENT OF COMPREHENSION AND LEARNING



The Problem 

If the analysis in the previous chapter is correct, the act of comprehending
a sample of verbal material (a "message") consists, at least initially,
of deriving a "meaning" or "semantic interpretation" for it. Once the receiver
of the message has derived this semantic interpretation, he may evaluate
it for its "acceptability" to him (in terms, for example, of truth, relevance,
or conformity to expectation) , and if it is "acceptable" he may assimilate it 
to his cognitive structure, in which case we may say that he has "learned" 
the content of the message. In addition, he may derive further cognitive 
structure from the text on the basis of inferential processes, but because of 
the complexity of these processes, we shall give them little attention. 

Thus, we pose for ourselves two problems in this chapter: 

(1) How can any outside observer of the communication sequence determine 
whether comprehension has actually occurred? More specifically, how can an 
observer determine how much has been comprehended, and how accurately it has 
been comprehended? 

(2) How can an outside observer determine what an individual has "learned" 
as the result of his receiving a message? How can one determine how the indi- 
vidual's "cognitive structure" has changed? 

These two problems are very difficult. They are difficult to separate
operationally, because any procedure for testing the degree to which an
individual comprehends a message tends to involve operations that also test
learning. Furthermore, both of these problems present an inherent difficulty
that arises from the fact that the processes one is interested in measuring
are internal and not directly observable; we can infer their nature only
from observations of overt behavior that accompanies the internal processes,
either spontaneously or as the result of special arrangements that can be
made, such as giving the individual prior instructions as to how he is to
respond.

It should be noted that our concern here is primarily with how we can
measure comprehension or learning in a specific instance where a verbal
stimulus has been presented, as opposed to the measurement of comprehension
or learning ability. An ability is a generalized property of the individual
expressed in terms of the probability that he would comprehend or learn the
meaning of any given message; one would infer an individual's ability from
his performance in some systematic sample of test situations in which messages
are presented to him for comprehension or learning. The problems of measuring
comprehension or learning in specific instances also apply to the measurement
of comprehension or learning ability, but in ability measurement many of these
problems can be circumvented by statistical averaging processes. For example,
comprehension ability can be measured by presenting the individual with a
series of sentences to evaluate for truth or falsity; even though the chance
of getting any one sentence correct by "guessing" is .5, with a large enough
sample of sentences one could nevertheless obtain a reliable measure. This
procedure (having subjects evaluate sentences for truth or falsity)
would be a highly unreliable one, however, for indexing the comprehension of
any one of the sentences.
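
The arithmetic behind this point can be sketched as follows. The function name `p_at_least`, the test length of 50 items, and the cutoff of 40 correct are hypothetical values chosen for illustration. The only figure taken from the text is the guessing probability of .5, and independence of the guesses is assumed.

```python
from math import comb


def p_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more correct out of n items by guessing alone."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# A single true-false sentence: a correct answer is obtained by chance half the time.
print(p_at_least(1, 1))  # 0.5

# A 50-sentence test: 40 or more correct by guessing alone is very unlikely.
print(p_at_least(40, 50))
```

A correct response to one sentence is therefore weak evidence of comprehension, whereas a high score over many sentences is very unlikely to arise from guessing.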

The main body of this chapter will be devoted to an examination of the
various methods that have been proposed for the measurement of comprehension
(understanding, in the sense specified in Chapter II); it will end with some
remarks on the measurement of learning.

Desiderata for Measurements of Comprehension



There are many kinds of procedures for measuring comprehension; we shall 
evaluate them with respect to the desiderata specified below. It would be 
comfortable to think that one could find one procedure that would meet all 
specifications, but apparently there is no such procedure. Procedures have
to be selected and tailored to meet the requirements of given situations. 

(1) Validity. Ideally, a measure of comprehension should reflect solely
comprehension (the derivation of a correct syntactical and semantic interpretation)
and not any other behavioral process such as memory, guessing, or
the like.

(2) Reliability. Ideally, a measure of comprehension should be reliable
in the sense that it gives consistent outcomes on equivalent trials for a
given individual. Unfortunately, it is difficult to imagine that in this
context there can be truly equivalent trials, because the individual is likely
to be changed as a result of even one exposure of the stimulus. Perhaps for
this reason, there have been few instances where the reliability of an outcome
has been investigated. (However, reliabilities of tests of comprehension
ability have been routinely reported.)

(3) Generality. Ideally, a procedure for measuring comprehension should
be applicable to (a) all types of verbal material, and (b) all classes of
individuals. By "all types of verbal material," we have in mind variation in
the quantity and complexity of the material (whether it be a single word, a
single sentence, a paragraph, or a longer discourse) and whether it be picturable
or non-picturable, concrete or abstract, literary or scientific in subject
matter, etc. By "all classes of individuals" we have in mind children, adults,
native vs. non-native speakers of the language, etc.

(4) Convenience and practicality. These aspects can be broken down into:

(a) ease in preparing the measurement device;
(b) ease in administering the procedure to the individual; and
(c) ease in scoring or otherwise evaluating the outcomes in a
valid and reliable way.



MAJOR TYPES OF PROCEDURES FOR MEASURING COMPREHENSION



1.0 Subjective evaluations of comprehension. Probably the simplest and 
most obvious procedure for determining comprehension is to ask the individual 
whether he comprehends. The validity of such a procedure clearly depends 
upon the honesty of the individual and his overall comprehension ability. Even 
if he is honest, he may report comprehension when he actually misperceives the 
meaning of the stimulus. Nevertheless, he is unlikely to report lack of 
comprehension when he actually comprehends. Under certain circumstances, this 
method may have considerable merit. Several specific procedures that have been 
investigated are as follows: 

1.1 Subjective evaluations of comprehension, accompanied by a latency 
measure. Danks (1969b) presented his subjects with a series of word strings 
varying in grammaticality and semantic abnormality. Samples: 

.lures mease sick children (grammatical and meaningful)-. Families hapjg . 
,„,- ph-hors pleasant make (meaningful but not grammatical); Wise parties crea^ 
sarlv flowers (grammatical but not meaningful) ; Active reach strange capta ^ 
nnes (neither grammatical nor meaningful). The subject was asked to press 
a button as soon as he "understood" the string, and the latency of this 
response was measured. The subjects were kept "honest" because they knew that 
every so often they might be asked to paraphrase the meaning they had 
apprehended. The validity of the procedure is upheld by the fact that the 
latencies showed strong relationships to the meaningfulness of the sentences: 
the "non-meaningful" sentences took much longer to "understand." 
(Grammaticalness, however, was not as well reflected by the latencies.) 

This procedure could not, of course, be used very generally. Danks 
worked with intelligent university students, all of whom could doubtless under- 
stand without any difficulty the simple "meaningful" sentences that were in- 
cluded in the stimulus sets. It is doubtful that this method would give 
valid and reliable results in evaluating individuals’ comprehension of 
meaningful, normal text of a high level of difficulty, especially when the 
subjects are of limited education or verbal ability. On the other hand, this 
method somewhat resembles Kershner's (1964) method of testing comprehension 
by measuring reading time, on the assumption that the subject will complete 
his reading only when he thinks he understands the material. 

1.2 Subjective evaluations of grammaticalness. Maclay and Sleator (1960), 
Coleman (1965b), Danks and Lewis (1970), Quirk and Svartvik (1966), and Tikofsky 
and Reiff (196?) have had subjects evaluate sentences for "grammaticalness" 
or "grammaticality." The sentences represent various degrees of deviance 
from normal English grammatical usage or patterning, and the evaluations have 
been made either by rating scale responses, ranking, or the like. It is found 
that in general subjects do indeed give ratings of grammaticalness in line with 
the degree to which the sentences conform to standard patterns, or are "well- 
formed" according to a grammar. It is beyond the scope of this review, 
however, to discuss the results in detail; the interest of this research is 
not in testing comprehension of sentences in response to grammatical patterns 
but in testing the degree to which one can predict the ratings by various 
systems of formal grammatical rules. It is debatable whether this procedure 
is adequate even for the latter purpose, in that "acceptability" in a 
communicative sense may not correspond very well to "grammaticality" in the 
sense of conformity to a given set of grammatical rules. In any case, the 
method does not yield valid measurements of comprehension since it is 
addressed principally to grammaticality, which according to Danks' (1969b) 
results can be orthogonal to meaningfulness or comprehensibility. 

1.3 Subjective evaluations of comprehensibility. Danks (1969b) presented 
a series of sentences varying in grammaticality and meaningfulness to 
university subjects, asking them to rate them for "comprehensibility," no 
explicit definition of comprehensibility being given. By statistical 
techniques, it was found that 9 % of the variability in the ratings could be 
explained by three orthogonal factors: grammaticalness, meaningfulness, and 
overall comprehensibility. Note that an underlying comprehensibility factor 
was independent of grammaticalness and meaningfulness! Carroll (1966) obtained 
judges' ratings of the "intelligibility" of sentences that were either human 
or machine translations of sentences from a Russian text; it was found that 
by pooling ratings of several judges, highly reliable measurements of 
intelligibility could be obtained, and that these pooled ratings were highly 
correlated with judgments of translation accuracy (and also, inversely, with 
reading times). While the judgments of comprehensibility obtained by Danks and 
by Carroll probably reflected the degree to which the judges actually 
comprehended the sentences, there is no guarantee of this. The method is 
focused on the potential "comprehensibility" of sentences rather than the 
actual degree to which judges understand them; it is of limited generality 
since it applies best 






in a situation where the verbal materials show wider variations in grammatical- 
ness and meaningfulness than are exhibited in ordinary utterances or texts. 
Schwartz, Sparkman, and Deese (1970) have used this technique for a wide 
variety of auditorily presented sentences and claim that it yields an index 
of comprehensibility that is "probably more sensitive and reliable than any 
word or sentence count readability index." 
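
The gain in reliability from pooling judges can be illustrated with the 
Spearman-Brown formula, a standard psychometric result that is not named in 
the studies cited above. The sketch below assumes judges whose ratings are 
equally reliable; the function name and the single-judge reliability of 0.50 
are hypothetical values chosen only for illustration. 

```python
def pooled_reliability(single_judge_r, n_judges):
    """Spearman-Brown estimate of the reliability of ratings pooled over n judges."""
    return n_judges * single_judge_r / (1 + (n_judges - 1) * single_judge_r)

# Hypothetical single-judge reliability of 0.50, pooled over 1, 4, and 9 judges.
pooled = [pooled_reliability(0.50, k) for k in (1, 4, 9)]
print([round(r, 2) for r in pooled])
```

Under these assumptions the estimate rises steadily with the number of judges, 
which is consistent with the finding that pooled ratings can be highly reliable 
even when individual ratings are not. 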

1.4 Evaluation of the truth or falsity of a statement. A time-honored 
procedure in various kinds of achievement tests is the so-called true-false 
item. Usually used in subject-matter achievement tests, it can also be used 
in tests designed to measure sheer language comprehension, particularly tests 
of foreign language competence. Because of the unreliability of the outcome, 
which can be influenced by guessing, this procedure is not recommended for 
assessing comprehension of a single message; furthermore, it can be applied 
only to statements whose truth or falsity will be immediately apparent to 
the subject once he has comprehended them. Nevertheless, Wason (1961) used 
this method in an experiment on the effect of grammatical negation; he 
presented sentences such as "87 is not an even number," "24 is an odd number," 
etc. and measured the latency of the judgments, pooling results over samples 
of such sentences. 
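
The effect of guessing on a true-false outcome can be made concrete with a 
small sketch. It assumes, hypothetically, a subject who comprehends nothing 
and guesses each item independently with probability 0.5; the function name 
and the example numbers are illustrative and are not taken from the studies 
cited. 

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more correct out of n items by guessing alone."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# A single true-false item is passed by pure guessing half the time.
single_item = prob_at_least(1, 1)
# Pooling over 10 items makes a score of 8 or more much less likely by chance.
ten_items = prob_at_least(8, 10)
print(single_item, round(ten_items, 4))
```

This is one way to see why a single true-false judgment says little about 
comprehension of a single message, while results pooled over samples of 
sentences are more informative. 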

1.5 Evaluation of centrality or importance of ideas in a passage. A 
number of reading or listening comprehension tests have used the device of 
asking the subject to identify those parts of a connected passage that are 
more central, important, or relevant to its main theme (Knower, 1945; 
Husbands and Shores, 1950; Abrams, 1966). Although this device may be useful 
in a test of comprehension ability, its validity and reliability for measuring 
comprehension of the material are questionable, because it gets at 
comprehension only indirectly and could easily yield false positive or false 
negative results. It would appear to be more valid in measuring ability to 
make inferences from text materials. 

1.6 Evaluation of importance of words in a sentence. Segal and Martin 
(1966) had subjects rate the importance of words in each of a number of 
sentences, finding a tendency for grammatical subjects to be rated higher 
than logical subjects regardless of the sentence transformation. The materials 
were all very easily comprehensible sentences. The procedure does not seem 
to be promising as a measure of comprehension; it was not designed for this 
purpose in any case. 

2.0 Asking questions designed to test comprehension of verbal material 
on which the questions are based. One finds on nearly all standardized reading 
or listening comprehension tests the device of presenting a paragraph to read 
or listen to and then immediately asking a series of questions covering the 
content of the paragraph. (Ordinarily, on reading tests this paragraph is 
available to the subject as he answers questions. In listening tests the 
subject has to depend on immediate memory.) This procedure is used, for example, 
in the McCall-Crabbs Standard Test Lessons in Reading, Gates Reading Tests, the 
Metropolitan Reading Tests, the Stanford Reading Tests, the Brown-Carlsen 
Listening Comprehension Tests, and many others. Since the object is to measure 
comprehension ability, the selection of items is controlled by statistics 
concerning whether correct answers to a given item are correlated with 
generally high scores on the complete tests, or with some external criterion 
such as scholastic success. The precaution of insuring that the items cannot be 
answered except at a chance level by an individual who has not read the 
paragraphs is not always taken. It is probably partly for this reason that the 
scores on these tests are quite highly correlated with measures of general 
verbal ability. Thus we can conclude that these are not pure tests of the 
comprehension of the particular paragraphs presented; they may also be tests 
of the ability to answer questions. Indeed, this type of test is often an 
integral part of "intelligence" tests such as the Scholastic Aptitude Tests 
sponsored by the College Entrance Examination Board. 

The questions posed on such tests are ordinarily of the "objective" 
type (true-false, multiple-choice, or matching), but sometimes they are essay 
or "free-response" items. These item types vary in reliability and validity 
but they tend to give highly correlated results (Serling, 1967). 

Tests of this type have often been used in various kinds of experimental 
studies on factors affecting reading or listening comprehension (e.g., Moore, 
1919; English, Welborn, and Killian, 1934; Jenkinson, 1957; Coleman, 1964a; 
Jakobovits, 1965; Lee, 1965; and Dawes, 1966). 

It has been claimed by some that depending upon the content and 
construction of the question, different kinds of reading or listening "skills" 
can be measured. Davis (1944), for example, claimed to be able to distinguish a 
number of separate skills such as ability to remember details, ability to 
make inferences, etc., but Thurstone (1946) demonstrated that Davis's data 
were well accounted for by a single dimension of reading comprehension 
ability. In a careful, recent study, Davis (1968) was able to show small but 
significant amounts of unique variance in tests designed to measure such skills 
as "recalling word meanings," "drawing inferences from content," and "following 
the structure of a passage." (Inspection of the items for "following 
structure" shows that they are essentially measures of ability to use syntactic 
and grammatical-antecedent cues.) It may then be that particular test 
questions can identify different aspects of comprehension. Such a conclusion is 
supported by the work of Bateman, Frandsen and Dedmon (1964), who showed in 
a factor analysis of the Brown-Carlsen listening test that some items measured 
memory for details, while others measured the ability to draw inferences. But 
memory for details and ability to draw inferences are not really aspects of 
comprehension: memory for details is a function of attentional processes and 
of time lags between exposure to the material and the time of testing; the 
ability to draw inferences is logically distinct from sheer comprehension. In 
any case, Derrick (1953) was unable to find any clear separations among (a) 
the ability to answer factual questions, (b) the ability to 
"read-between-the-lines," and (c) the ability to make critical judgments. Nor 
was Derrick able to find that it made any difference whether the passages on 
which the questions were based were short or long. 

If one is going to use questions to determine the degree of comprehension 
attained by reading or listening to verbal material, it is absolutely essen- 
tial to insure that the questions cannot be answered (except at a chance level) 
by individuals who have not read or listened to the material that is to be 
presented. For some purposes, it may also be desirable to assure oneself that 
the content of the material is probably unfamiliar to the members of the 
group tested. Weaver and Bickley (1967c) point out that it is often the case 
that 
     reading comprehension tests are highly dependent on examinee 
     characteristics which often have little to do with the reading task 
     [...] assumes he is presenting. Reading tests [...] 
     past learning, word association, irrelevance of distractors, and 
     "item conceptual-information constraints," as well as the person's 
     ability to answer multiple-choice items directly from cues in the 
     reading display. The sources of variation are so confounded [...] 
     two or more factors could be hidden here, and one would never know. 
     Much of the confounding could be reduced by changes in methods of 
     selecting items. 



This remark applies equally well to methods of constructing items. What 
is needed is a design in which the questions are pre-tested on groups that 
have not been exposed to either the general or specific content of the material 
to be presented; questions that are equally likely to be answered correctly 
by both nonexposed and exposed groups are either rejected or changed until 
there are clear differences in the responses of the two groups. Such 
procedures have been used by a few careful investigators (Beighley, 1952, 1954; 
Fairbanks, Guttman, and Miron, 1957a). Marks and Noll (196?) present a 
technique that is to be highly recommended for evaluating items on reading and 
listening tests. By using the controls that they suggest, one can be 
reasonably certain that responses to comprehension items validly measure the 
degree to which the subject has been able to acquire new knowledge through 
exposure to verbal material. Use of this technique will also tend to 
control for the fact that some pupils have as much difficulty understanding 
the questions as they have in understanding the material on which the 
questions are based (cf. Piekarz, 1954). 
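
The pre-testing design described above, in which questions are tried on 
exposed and nonexposed groups, can be sketched as a simple item screen. The 
function name, the thresholds, and the proportions below are hypothetical; 
they are not the procedure of any investigator cited, only an illustration of 
the logic of rejecting items that the nonexposed group answers above chance or 
that the exposed group answers no better. 

```python
def screen_items(p_unexposed, p_exposed, chance, min_gain=0.20, margin=0.10):
    """Split items into those to keep and those to revise or reject.

    p_unexposed, p_exposed: dicts mapping item id -> proportion correct.
    chance: proportion correct expected from guessing alone.
    """
    keep, revise = [], []
    for item in p_unexposed:
        near_chance = p_unexposed[item] <= chance + margin
        gain = p_exposed[item] - p_unexposed[item]
        if near_chance and gain >= min_gain:
            keep.append(item)
        else:
            revise.append(item)
    return keep, revise

# Hypothetical four-option multiple-choice items (chance = 0.25).
p_unexposed = {"q1": 0.27, "q2": 0.70, "q3": 0.24}
p_exposed = {"q1": 0.81, "q2": 0.78, "q3": 0.30}
keep, revise = screen_items(p_unexposed, p_exposed, chance=0.25)
print(keep, revise)
```

Here "q2" is flagged because it can be answered without the passage, and "q3" 
because exposure makes no clear difference to performance. 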

Bormuth (1970b) has pointed out that achievement test questions can 
frequently be analyzed as grammatical transformations of material in the text. 
He urges that such items are easy to construct when viewed in this way and 
likely to be valid in measuring pure comprehension as opposed to inference. 
3.0 Following verbal directions. Tests of the subject's ability to 
follow directions have appeared in intelligence tests (e.g., the well-known 
Army Alpha) but have rarely been used in experimental studies of 
comprehension, despite the fact that such tests could in many circumstances be 
highly valid measurements. In a realistic classroom experiment, Brown (1955) 
studied students' ability to listen to instruction concerning the spelling 
rules for doubling consonant letters before the suffix -ing, and then tested 
for comprehension by having them spell a number of words ending in -ing. 
Jones (1966) investigated the effect of the negative qualifier except by 
having children perform a cancellation task under either of two instructions: 
"Mark the numbers 1, 3, 4, 6, 7" and "Mark all the numbers except 2, 5, 8." 
These two instructions were logically equivalent, since only the digits 1 
through 8 were presented. Shipley, Smith, and Gleitman (196?) tested young 
(1 1/2 to 2 1/2 years of age) children's ability to respond to commands 
concerning pointing to objects and found that they failed to respond to 
commands containing nonsense words even when relevant meaningful words were 
retained in the command. Coleman (in press) has reported a series of studies 
on grammatical factors determining the length of time a child needs to read a 
printed instruction in order to be ready to perform an arithmetical task 
(e.g., "Subtract two from the mean of the rows"); the child then performs the 
task to show comprehension. He recommends the following-directions procedure 
as one of the simplest and most valid methods for measuring comprehension. 

With Coleman's recommendation we can agree, with the following 
reservations, however: (1) as with a number of other procedures, one must 
assure oneself that the criterion task cannot be performed unless the subject 
has been exposed to the instruction; (2) this procedure may be applicable only 
in connection with a relatively limited range of verbal materials; (3) it 
may be difficult to exclude problems of memory and various performance factors: 
the individual may comprehend the instructions but forget them, or become 
confused, when he actually performs the task. 
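
The logical equivalence of the two instructions in the cancellation task of 
Jones (1966), described above, can be checked with a short sketch. The 
variable names are illustrative, and the listed digits are read as 1, 3, 4, 
6, 7. 

```python
# Only the digits 1 through 8 were presented in the cancellation task.
digits = set(range(1, 9))

# "Mark the numbers 1, 3, 4, 6, 7"
mark_listed = {1, 3, 4, 6, 7}
# "Mark all the numbers except 2, 5, 8"
mark_except = digits - {2, 5, 8}

same_targets = mark_listed == mark_except
print(same_targets)
```

Because the two instructions pick out the same targets, any difference in 
performance can be attributed to the wording rather than to the task itself. 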

4.0 Measurements taken during reading. Various oral reading tests 
(Gates, 1953; Gilmore, 1951) illustrate procedures in which the comprehension 
of a paragraph is measured in terms of the child's ability to read it aloud 
without hesitations, mispronunciations, and the like. However, this technique 
seems to get at mainly the ability to decode print and is thus beyond the 
scope of this review. 

On the assumption that an individual will attend to a reading selection 
only as long as he needs to gain the information it contains, a measurement 
of silent reading time may give an indirect indication of comprehension. We 
have already seen an application of this idea in the work of Danks (1969b), who 
measured the latency of a button press used by the subject to indicate 
comprehension of a simple visually-presented sentence. This idea has also been 
used by Weaver and Garrison (1966), who found significant differences in 
reading times for sentences as a function of the position of prepositions. 
Nevertheless, a subject will spend more or less time reading depending upon 
whether he expects to be tested. Kershner (1964) and Rothkopf (1968a) found 
that with repeated exposures to textual material college students took 
decreased time to read the material and at the same time made increasingly 
better scores on a "cloze" test of comprehension (see 9.1). Thus, reading 
time during the first exposure is not necessarily a valid indication of 
comprehension or of information gained during that exposure. Reading time 
can be used as a measure of comprehension only in special circumstances. 







The same can be said for the eye-voice span, i.e., the amount of 
additional material that an individual reading aloud can report after 
illumination for reading is terminated. The technique has been used by 
Schlesinger (1966b), Levin and Kaplan (1966), and Levin and Turner (1966) to 
investigate 
the role of grammatical structure in the perception and comprehension of 
textual material; Schlesinger concludes, for example, that the eye-voice span 
typically reaches "to the end of either a syntactic constituent or of a 
'chain,' which was defined as a group of words that the reader in his left-to- 
right perusal of the sentence might take to be a constituent" (p. 33). 

Edfeldt (1960) has shown that experienced readers do not make subvocal 
movements (detectable by electromyographic techniques) when reading easy 
material, but these movements become detectable when the material becomes 
difficult. Electromyographic techniques, then, might be used to index the 
difficulty an individual has in understanding material he reads, but they 
would not provide a direct measure of comprehension, and might be affected 
by a number of other variables besides comprehension. Hardyck, Petrinovich, 
and Ellsworth (1966) report a technique for suppressing subvocal movements. 

Patterns of eye movements are so variable within and among individuals 
that they show very little dependence upon the difficulty of material 
(Anderson and Dearborn, 1952, pp. 128 ff.) and are therefore generally 
unreliable as indicators of comprehension. As reported by Miller and Isard 
(1964, fn. p. 299), however, Mackworth and Bruner were able to use eye 
movements to index the difficulty of sentences. Highly self-embedded sentences 
were generally read with more fixation units than sentences not so embedded. 
5.0 Verbatim recall. The study of recall is one of the best-developed 
areas in experimental psychology, but a great deal of the work has concerned 
the recall of relatively simple stimulus displays such as lists of nonsense 
syllables. The study of the recall of connected verbal discourse has received 
major attention only in recent years. We must consider what connection, if 
any, this has with the measurement of comprehension of verbal material. 
Logically, there is no necessary connection. One could, for example, 
comprehend a text and then immediately forget it. On the other hand, one might 
have perfect recall for a string of unconnected, incomprehensible words in a 
foreign language. The connections between recall and comprehension must be 
tenuous, or at least complex. In this section we will examine simply the 
techniques that have been used for the study of recall, with some preliminary 
comments on the extent to which these techniques yield valid evidence 
concerning comprehension. 

5.1 Verbatim recall immediately after presentation. When the material 
is of very short duration, the subject can recall verbatim as a function of 
what is called memory span or short-term memory. Surprisingly, there is 
little direct evidence as to exactly what the memory span for verbal material 
(e.g., unrelated words) is; Miller (1956) reports data from Hayes to indicate 
that this memory span is above 5 (at least for monosyllables). As soon as 
there is any degree of semantic or syntactical organization in a series of 
words presented for immediate recall, the number of words that can be recalled 
correctly increases beyond the normal span (Marks and Jack, 1952). This is not 
to say, however, that short-term memory factors cease to operate. 

Since memory span for young children is normally less than seven, even a 
grammatical sentence of seven words can tap the linguistic competence of a 
young child. Binet's developmental scale of 1911 as cited by Terman (1916, 
pp. 37-39) included the following items: 

Age 3: Repeats a (spoken) sentence of six syllables. 

Age 5: Repeats a (spoken) sentence of ten syllables. 

Age 15: Repeats a (spoken) sentence of twenty-six syllables. 

The child passed the test only if reproduction was perfect. Terman used 
similar tests in his 1916 Stanford-Binet scale, but they no longer appear 
in the latest, 1960, revision (Terman and Merrill, 1960). However, tests for 
Repeating Thought of a Passage appear at the Superior Adult II and III levels; 
here, verbatim recall is not required, but the subject must give, in proper 
sequence, accurate reproductions of the "component ideas." 

The experimental study of verbatim reproduction of longer passages 
(Henderson, 1903; Lyon, 1917; Clark, 1940) has generally depended on a scoring 
procedure known as the "method of retained members." The stimulus passage is 
divided into a number of phrasal units of approximately equal size; the 
subject's response is then scored in terms of the number of these units that 
are accurately reproduced. Sometimes partial credit is given for 
reproduction of the thought of a unit when it is not verbatim. Levitt (1956) 
showed that different investigators are likely to make different divisions 
of a passage and these differences are likely to be reflected in recall scores. 
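
The scoring logic of the "method of retained members," and the dependence of 
the score on how the passage is divided, can be illustrated with a minimal 
sketch. The passage, the two divisions, and the function name are 
hypothetical; matching is by exact substring only, with no partial credit, 
which is a simplification of actual scoring practice. 

```python
def retained_members(units, recall):
    """Count the phrasal units that appear verbatim in the subject's recall."""
    normalized = " ".join(recall.lower().split())
    return sum(1 for unit in units if " ".join(unit.lower().split()) in normalized)

# One hypothetical recall, scored against two different divisions of the passage.
recall = "the old man walked slowly to the market"
division_a = ["the old man", "walked slowly", "down the road", "to the market"]
division_b = ["the old man walked slowly", "down the road to the market"]

score_a = retained_members(division_a, recall) / len(division_a)
score_b = retained_members(division_b, recall) / len(division_b)
print(score_a, score_b)
```

The same recall earns a different proportion of retained units under each 
division, which is the kind of discrepancy between investigators that Levitt 
(1956) reported. 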

Indeed, the major difficulty with the study of recalls of connected 
discourse seems to be that of scoring. King (1960, 1961), King and Russell 
(1966), and King and Yu (1962) have reported a series of studies showing that 
when judges are asked to scale written recalls for excellence, two factors 
influence their judgments: a "quantitative" factor having to do with the 
amount of recall (number of words, and the like), and an "organization" factor 
having to do with the quality and organization of the semantic content. This 
would mean, incidentally, that some judges are more influenced by quantity, 
others by organization. 

One of the most perceptive studies of verbatim recall was by Gomulicki 
(1956), who presented his subjects with 37 prose passages, from 13 to 95 words 
in length. He studied the reproduction of each word, judging it as either 
"adequate" or "inadequate." Over the whole set of reproductions, 55.5% of the 
words were reproduced verbatim, 32.7% were omitted, 11.8% were changed, and 
6.2% were added words or ideas. The frequency with which a given element was 
"adequately" represented was regarded as a measure of its "mnemic value." 
Mnemic value was then studied as a function of semantic content (action vs. 
description) and grammatical function. Recall was regarded as an "abstractive 
process" because the best remembered materials described actor-action-effect 
sequences; there was even a tendency for Ss to turn descriptive passages into 
"quasi-narratives." 

Immediate verbatim recall of verbal materials has been used to study many 
aspects of language behavior and learning. 

Basic processes in recall: Bartlett (1932), Paul (1959), to give only a 
few examples. 

The effect of organization (order of approximation to English): Miller and 
Selfridge (1950), Deese and Kaufman (1957), Sharp (1958), Herrmann (1962), 
Tulving and Patkau (1962), Slamecka (1964), Knox and Wolf (1965), Cohen and 
Johansson (1967). 

The effect of syntax and other grammatical factors: Miller (1962b), 
Martin and Roberts (1966), Robins (1968), Slobin and Welsh (1968). 

The effect of various instructions as to what is to be recalled: 
Schwartz and Lippman (1962), King and Russell (1966). 

The effect of associational factors: Rosenberg (1968e). 

Method of reproduction: Clark (1940), Horowitz and Berkowitz (1967), King 
(1968c). 

Oral vs. printed stimuli: King and Madill (1968). 

These and other studies will be reviewed under appropriate headings later 
in this monograph. 

5.11 Verbatim recall after a set of materials has been presented. A 
minor variation of the procedure presented in section 5.1 has been used in a 
number of experiments on the effect of syntactical factors in recall (e.g., 
Marks and Miller, 1964). A set of word-strings is presented to the subject 
in sequence; he is then asked to write them down in any order as accurately 
as possible. Actually, Marks and Miller carried out this procedure for five 
trials to trace learning over trials. Since learning occurred even for 
normal sentences it is evident that the procedure tests recall much more 
than comprehension; because of the simplicity of the normal sentences (e.g., 
"Rapid flashes augur violent storms") there is little doubt that they were 
comprehended on first presentation. 

5.12 Prompted verbatim recall after a set of material has been presented. 
A further minor variation is to use the procedure in (5.11) but with "prompts." 
Mehler (1963), for example, gave Ss a set of eight sentences varying in 
grammatical transformation; after each trial, Ss were given prompts consisting 
of nouns in either the subject or predicate position. 
5.13 Verbatim recall after a time period in which interfering stimuli 
have been presented. When the verbal material is extremely simple, it may 
be desirable to test recall by interposing distracting stimuli between the 
time of presentation and the time of recall. Wilson (1966) had children 
read either single words, 3-word syntactic strings, or 3-word non-syntactic 
strings, after which they were required to read ordinary text for 15 seconds 
before giving their recall of the stimulus. 

Savin and Perchonock (1965) introduced a technique whereby the amount of 
grammatical material encoded in memory was claimed to be measured by the 
amount of additional material that could be remembered at the same time. A 
sentence was presented, followed by a string of eight unrelated words; the 
subject was to recall the sentence and then as many as possible of the eight 
additional words. However, Epstein (1969) has raised the question of whether 
Savin and Perchonock's results might equally well be explained in terms of 
difficulty in retrieval processes. 

5.2 Delayed verbatim recall. Data on the accuracy of delayed verbatim 
recall of a prose passage presented only once are scarce. 

In one of Slamecka’s (1959) experiments on retention of connected dis- 
course, subjects had a mean score of 12.8 (out of a possible 28) for immediate 
recall of a 28-word passage after one presentation; after a period in which 
they had to learn another, unrelated passage, their mean recall was only 7.1. 
This gives no indication of what their recall would have been if they had had 
no original recall and no interpolated learning. Common experience would indi- 
cate that verbatim recall of verbal materials after one presentation is not 
very good even immediately after the presentation, and decreases rapidly with 
time, especially when the interpolated interval is filled with activities that 
tend to interfere with original learning. 

5.3 Amount of time to memorize, with uninterrupted opportunity for 
inspection. The amount of time to memorize verbal material depends 
upon the complexity of the material. This can be shown either by giving the 
individual a set amount of time to study and measuring the amount of recall, 
or by determining the amount of time the individual needs until he can 
reproduce the material to some given criterion of accuracy. Rubenstein and 
Aborn (1958), using the former procedure, showed that for 30 200-word passages 
culled from a wide variety of sources, the average learning score attained 
by a group of subjects was highly correlated with two readability indices 
applied to the passages and also with a "predictability" score (see section 
9.2). Using the latter procedure, Follettie and Wesemann (1967) showed 
that learning time was related to various characteristics of prose passages 
(principally, their length in terms of grammatical units). 

5.4 Repeated study-test learning trials, one stimulus at a time. In 
this procedure, the subject is repeatedly given learning trials consisting of 
a presentation phase (usually of constant duration) and a test phase (also 
usually of constant duration) in which the subject attempts to reproduce the 
stimulus either orally or in written form. The same stimulus is presented 
over the number of trials. The number of trials may be constant, in which case 
the learning score is the number of words recalled, and/or the number of 
errors (Sharp, 1958; Tulving and Patkau, 1962; Miller and Isard, 1964; Martin 
and Roberts, 1966; Rosenberg, 1968a), or it may depend on the performance of 
the subject in attaining a criterion of perfect reproduction, in which case the 
learning score is the number of trials to criterion (Epstein, 1961, 1962; 
Coleman, 1965b; Bogartz and Arlinsky, 1966). In this type of study, an 
improvement in mean performance from an initially rather low level is 
universally noted. The design does not permit any appraisal of the extent 
to which the stimulus is understood on any of the presentations since 
measurements are concerned solely with the subject's success in retrieving 
the memory of the stimulus, i.e., in constructing the response correctly. 

5.41 Repeated study-test learning trials, with sets of stimuli and
free order of recall. This procedure is similar to (5.4) but a set of
unrelated stimuli are given in the presentation phase; in the test phase
S is allowed to recall these, as accurately as possible, but in any order he
pleases. The effect of this procedure is to introduce (a) a certain amount
of delay between presentation of the stimulus and the test, and (b)
interference among the several stimuli in a stimulus set. These factors make
the subject's retrieval task more difficult; they probably have little or
no effect upon comprehension of the stimulus. A study illustrating the
procedure is that by Martin and Roberts (1967).

5.5 Paired-associate learning. This classical procedure can be
regarded as a method of prompted recall; it is particularly appropriate for
studying the effects of relations between the "stimulus" and "response"
members of pairs, or of relations among the several stimuli or responses
in the set. There are two main varieties of the procedure. One is the
"anticipation" method, in which a trial consists of the successive
presentation of the paired stimuli (the "stimulus" member of each pair being
presented before the "response" member); with succeeding trials, S is
required to try to "anticipate" (say aloud) the response member of each
pair before it is actually presented. Illustrations of studies using this
method are those by Martin and Jones (1965) and Martin, Davidson, and
Williams (1965). The other method, illustrated by studies by Rohwer, Shuell,
and Levin (1966) and Rohwer, Lynch, Levin, and Suzuki (1967), is the
"study-test" method in which a list of pairs is presented to the subject for
study for a specified amount of time, after which he is presented with the
stimulus terms and asked to give the response terms.

5.6 Serial learning. In the usual verbatim recall experiment, a
passage is presented to S to read or hear as a whole. Epstein (1962)
wondered whether the organizational factors that facilitate recall of
such materials as compared with unstructured materials would also
facilitate learning when the materials are presented word by word in the
conventional serial learning paradigm. The serial learning procedure consists
of a series of trials; in each trial, the material is presented word by word
(e.g., by memory drum), and with succeeding trials S is expected to learn
to anticipate the successive words before they actually appear. Epstein found
that sentences are no more readily learned in serial order than the same words
in random order. Apparently the serial presentation prevents the subject
from readily apprehending any syntactical structure in the material, while
whole presentation does not. However, Epstein did not instruct his
serial-presentation subjects to look for structure.

5.7 Recall by paraphrasing or giving essential ideas. To ask the
subject to give back the substance of a sample of verbal material in his
"own words" would seem to be a rather valid way of testing his comprehension.
Yet, this method has been very rarely used in experimental studies of
comprehension. There are at least three major difficulties with the
procedure, at least if a strict paraphrase is required: (1) telling the subject
to use his "own words" may place an extra burden on him when he can remember
some of the words verbatim; (2) it is difficult to score paraphrases for
content conformity to the original, as Downey and Hakes (in press) found;
and (3) the procedure does not exclude the possibility that the subject
may have difficulty in retrieving information even though it has been
understood during original presentation. Clark (1940) found that even
when Ss were asked to give verbatim reproductions, successive reproductions
improved in quality even though the subject had no opportunity to
re-inspect the original. Clark's experiment suggests strongly that retrieval
factors are involved in any recall, but it also suggests that the validity
of a recall test (whether it is to be verbatim or a paraphrase) could be
increased by allowing the subject to make several successive attempts at
reproduction.

Jones and English (1926) found that even after one reading of a 91-word
passage, Ss were able to give an average of of the 31 "ideas" regarded
as contained in it. A similar procedure was used by Cofer (1941). In
neither of these studies were the Ss instructed to avoid using the same
phraseology as the original. They found, as might be expected, that recall
of ideas was much easier than verbatim learning.

5.8 The "probe-latency technique." This technique was developed by
Suci, Ammon, and Gamlin (1967) for investigating the role of phrase structure
in the apprehension of language. A subject is given a sample of verbal
material, such as a sentence. This presentation is immediately followed by
the presentation of one word selected from the sentence; the subject is
required to think back to that word and give the word that followed it.
The latency (time in seconds) of this response is measured. According to
these authors, as well as Ammon (1968), the method gives results in line
with certain expectations regarding phrase structure. While comprehension
might facilitate performance of this task, the technique is not likely to
be a sensitive measure of comprehension.

In sections 5.0 to 5.8, we have reviewed all the techniques utilizing
recall and found them wanting in their ability to measure comprehension.
From results on recall tests, it is generally difficult to tell to what
extent any of at least three factors may be operating: (1) understanding
of the material at the time of original presentation, (2) "storage"
processes acting during original presentation to set the stage for recall,
affecting either the semantic content of the material or the particular
words used to express it, and (3) "retrieval" processes during the process
of recall. In view of this, we recommend great caution in interpreting
the results of recall tests as indications of comprehension.

6.0 Giving a translation of verbal material, with opportunity for
continual inspection. A traditional way of determining whether an individual
understands material in a foreign language is to ask him to translate it
into his native language. One may also suggest that a way of determining
whether an individual understands material in his native language is to
ask him to translate it into some foreign language that he knows. Such a
method has rarely been used in studies of comprehension as such, however,
for the obvious reason that subjects are rarely expected to be sufficiently
competent in a foreign language to perform the task. The method has
considerable appeal because it offers the possibility of ruling out recall
factors. Nevertheless, there would be difficulty in scoring translations,
particularly in view of the fact that there are only rarely one-for-one
translation equivalents between two languages.

7.0 Techniques depending on recognition. A traditional method of
measuring learning and memory has been the recognition technique, whereby
the subject who has learned something is then presented with some of the
old stimuli together with some new stimuli and asked to indicate which are
old and which are new. Some of the questioning techniques described under
2.0 depend upon recognition; at least, this is true of true-false questions
and certain kinds of multiple-choice questions when they present material
either unchanged or slightly modified from the original stimulus material
and ask the subject, in effect, to indicate whether he recognizes the
original stimulus material. Shepard (1967) has shown that college-age
subjects are remarkably efficient in distinguishing new material from old
material even when the old material is of considerable extent. For example,
Ss were 89% accurate in identifying sentences they had inspected in a list
of 612 clearly different sentences. All the sentences were, however, very
simple to understand (e.g., "A dead dog is no use for hunting ducks."), so
that one cannot say that the test was one of comprehension.
Nevertheless, the recognition technique has been used by several
investigators to examine detailed processes of comprehension. Clifton, Kurcz,
and Jenkins (1965), and Clifton and Odom (1966) used a recognition task to
index the grammatical similarity of sentences; after presentation of a
series of sentences, these same sentences together with slight grammatical
transformations of them (negative, passive, question) were presented and
the subject was asked to press a telegraph key whenever he thought he
recognized one of the "old" sentences. The patterns of errors were found to
correspond to some degree with the similarity of the sentences in terms of
transformational distance, lending support to the "coding" hypothesis
whereby sentences are stored in memory in terms of (a) their base forms, and
(b) the transformations applied to them.

Lee (1965), Fillenbaum (1966), Newman and Saltz (1960), and Sachs (1967a,
1967b) have used the recognition task to find out the extent to which subjects
remember the verbatim form of words or sentences versus their meanings. The
evidence indicates, in general, that verbatim forms are remembered only for a
relatively short time, whereas meanings are remembered much longer. All the
materials used by these investigators were readily understandable in the
original form (except possibly the longer paragraphs used by Lee). Thus, in
these investigations the recognition task cannot be regarded as a test of
comprehension. If the original materials were of greater difficulty, however,
the recognition task might offer a useful measuring technique, inasmuch as
sheer memory for meanings has been shown to be fairly long-lasting.

The "chunking" technique recently employed by Carver (1970a) can in
fact be regarded as an application of the recognition task for materials
that are relatively difficult to understand. Carver's technique is to
present a passage for reading, typically four or five paragraphs long.
This is then immediately followed by a multiple-choice test. In each item
of the multiple-choice test, each alternative consists of a "chunk" of the
original: a clause, a phrase, or sometimes even a single word; one "chunk,"
however, is changed in meaning by the substitution of a different word or
phrase. The subject has to indicate which alternative does not convey the
original meaning.

An example will illustrate the technique. The first paragraph of one
of Carver's selections is as follows:

Voter apathy is almost a cliche in discussions of American 
politics. Yet, only a cursory look at voting and registration 
restrictions shows that many would-be voters do not cast ballots 
because they are prevented from doing so. 

The test items covering this part of the selection are as follows: 



1. (A) Voter apathy
   (B) is almost a cliche
   (C) in discussions
   (D) of American politics.
   (E) a recent poll directed

2. (A) at voting
   (B) and registration restrictions
   (C) shows that
   (D) many would-be voters
   (E) seldom protest or demonstrate

3. (A) because they are prevented
   (B) from doing so.
   (C), (D), (E) [The remaining alternatives cover the beginning of the
       next paragraph in the selection.]



The changed alternatives are constructed and item-analyzed in such a way
that individuals who have not read the original passage are unable to score
much above chance. The technique seems to have considerable promise,
although it must be noted that the standardization and validation of the
multiple-choice items is a fairly complicated process.
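
The scoring logic of a "chunked" test of this kind can be sketched briefly.
The Python below is an illustration under stated assumptions, not Carver's
own procedure: the names ChunkedItem, proportion_correct, and chance_level
are hypothetical, and the two items are simply the first two items of the
example given in the text. It shows why a five-alternative item has a chance
level of 20 per cent, the baseline against which non-readers are compared.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChunkedItem:
    chunks: List[str]  # the alternatives, in passage order
    changed: int       # index of the alternative whose meaning was altered


def proportion_correct(items, responses):
    """Fraction of items on which the subject picked the altered chunk."""
    hits = sum(1 for item, r in zip(items, responses) if r == item.changed)
    return hits / len(items)


def chance_level(items):
    """Expected score for blind guessing among the alternatives."""
    return sum(1 / len(item.chunks) for item in items) / len(items)


# Hypothetical encoding of the first two example items (alternative E is
# the altered chunk in both).
items = [
    ChunkedItem(["Voter apathy", "is almost a cliche", "in discussions",
                 "of American politics.", "a recent poll directed"], 4),
    ChunkedItem(["at voting", "and registration restrictions", "shows that",
                 "many would-be voters", "seldom protest or demonstrate"], 4),
]
```

A subject who answers E on the first item and A on the second would score
proportion_correct(items, [4, 0]), i.e., one item out of two.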

8.0 Techniques in which comprehension is tested by requiring verification
against a pictured referent. If a sentence is presented and the subject
is asked either to tell whether a picture accurately represents its meaning
or to choose one of several pictures that best represents its meaning, this
would appear to have rather high validity in testing comprehension, apart
from problems involved in guessing among the alternatives. The technique
has been successfully used in a number of foreign language comprehension
tests, and it is occasionally used in tests of listening or reading
comprehension, particularly those for young children. An assemblage of such
items constitutes a fairly valid and reliable test of comprehension ability.
The technique does have several advantages: (1) it is "face valid," to the
extent that the subject's ability to choose the correct picture reflects
his actual comprehension of the message; (2) it is only minimally affected
by differences in the subject's ability to read printed alternatives
(this is particularly advantageous in the case of listening tests, but also
applies in the case of reading tests); (3) alternative choices can be
designed in such a way as to trap the subject who has only partial
comprehension. Disadvantages of the technique are: (1) it is usually affected
by a guessing component which makes it unreliable for testing comprehension
of single sentences; (2) it is often inconvenient and difficult to prepare
appropriate pictures; (3) it is limited to sentences or text materials that
lend themselves to pictures, and even so, many concepts (e.g., tense
relationships) are hard to represent by pictures, except possibly by moving
pictures or by cartoon sequences; (4) it is practically impossible to
prepare pictures that will discriminate all the lexical and grammatical
material that the sentence may contain; and (5) the technique may depend on
pictorial perception processes of unknown complexity. Nevertheless, with
appropriate care, the technique is highly useful in many circumstances.

In several cases, it has been used in experiments concerned with
processes in sentence comprehension. Gough (1965, 1966) had subjects verify
sentences against pictures, under two conditions: (1) the picture was
presented coincident with the beginning of the final word of the auditorily
presented sentence, or (2) the picture was presented three seconds after
the termination of the sentence. Even when the picture was delayed, active
sentences were verified faster than passive ones, and affirmative sentences
faster than negatives, contrary to what one might expect if it is supposed
that the hearer immediately decodes a complex sentence by transforming it into
its underlying structure. Slobin (1966) has used a similar technique, finding
that one of the primary determinants of whether passives are not as readily
verified as actives is whether the action is "reversible" (e.g., both "the cat
chases the dog" and "the dog chases the cat" are possible) or "non-reversible"
(e.g., "the girl waters the flowers" is possible but the reverse is not).

9.0 Techniques depending upon context and redundancy. One of the
standard tools in mental testing is the "completion item," where the
examinee has to fill in a missing element from the context that is given. As
used on "intelligence" tests, the context is carefully selected so that
only one response is acceptable, or at most a very limited number of them.
The context in this case is often a definition or a sentence that describes
some situation where only one particular word to be filled in "makes sense."
Use has also been made of the opposite technique, i.e., inserting or
substituting in the text a word or phrase that "spoils the sense" of the
message, and asking the examinee to identify it. Apparently this technique
was first used in the Chapman-Cook Speed of Reading Test (1923); the
examinee's speed of reading is indexed by how rapidly he can work through
a passage or series of passages and find the extraneous items. Such a
procedure is open to certain objections: it is not a normal form of reading
task, since the unwanted items spoil the meaning and may be a distraction; and
sometimes by adopting a certain appropriate strategy, the subject can
identify the incorrect items without really comprehending the passage.

9.1 The (standard) "cloze" technique. Introduced, or as some would
have it, re-introduced by Taylor (1953) as a convenient and reliable measure
of "readability" (a characteristic of text material), the "cloze" procedure
has also gained some acceptance as a measure of an individual's degree of
comprehension of material (Taylor, 1957). The procedure involves taking a
passage of text material and deleting words in it by some rule, e.g., every
nth word, every noun, or the like. Most frequently, n is set equal to five
when systematic deletions are made, but other values, up to n = 12, have been
used. The pupil is then presented with the passage and asked to try to guess
the missing words. Usually the passage is presented in written form, in
which case the missing words are indicated by blanks of a standard size.
Peisach (1965), Dickens and Williams (1964), and Weaver and Kingston (1963)
have demonstrated the feasibility of administering the cloze technique in
an auditory mode: the passage is recorded on tape and specified words are
replaced by some special signal (e.g., a white noise) plus time for recording
answers, or the test is administered orally by a teacher who tells the pupils
to guess a word whenever she claps her hands.
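
The systematic (every nth word) deletion rule is mechanical enough to be
sketched in code. The Python below is a minimal illustration, not a
standardized procedure: the function name make_cloze, the blank string, and
the decision to leave adjoining punctuation in place are all assumptions made
for the example.

```python
PUNCTUATION = ".,;:!?\"'()"


def make_cloze(text, n=5, blank="_____"):
    """Delete every nth word; return (mutilated text, list of deleted words)."""
    mutilated, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        core = word.strip(PUNCTUATION)  # the word without adjoining punctuation
        if position % n == 0 and core:
            answers.append(core)
            # Replace only the word itself, so punctuation stays visible.
            mutilated.append(word.replace(core, blank, 1))
        else:
            mutilated.append(word)
    return " ".join(mutilated), answers


passage = "Voter apathy is almost a cliche in discussions of American politics."
test_text, key = make_cloze(passage, n=5)
```

With n = 5, the fifth and tenth words of the sample sentence ("a" and
"American") are replaced by blanks of a standard size, and the deleted words
are retained as the scoring key.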

Various types of scoring procedures are employed. Usually, the score
is based on the number of words in the original that the subject is able to
guess exactly (aside from insignificant number/tense changes or spelling
errors). Such a score has the advantage of being objective, and it has been
found to correlate highly with other types of scores, such as those where
words of similar meaning, or of similar grammatical function, are allowed
as "correct" responses. However, the type of score that is most advantageous
may depend upon the purpose of the cloze test. For purposes of measuring
"readability" or "listenability," where the average score for a passage is
obtained from a considerable number of readers (say, 25), the score based
on exact word replacements may be very satisfactory. Likewise, for
measuring general comprehension ability, where the individual's score is based
on a large number of items and passages, the strict scoring criterion is
most convenient and probably as valid as other scores. But for measuring
an individual's comprehension of a particular passage, the more relaxed
types of scoring may be more satisfactory. There has not been enough
research on methods of scoring for an individual's comprehension of a
passage.
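
The contrast between strict and relaxed scoring can be made concrete with a
short sketch. The Python below is illustrative only: score_cloze and the
table of acceptable synonyms are hypothetical, the synonym table would in
practice have to be supplied by judges, and the tolerance for minor
number/tense or spelling variants mentioned above is not implemented.

```python
def score_cloze(responses, answers, synonyms=None):
    """Return (exact proportion, relaxed proportion) for one subject.

    synonyms maps a lower-cased original word to a set of lower-cased
    responses that judges would accept as similar in meaning (assumed input).
    """
    if not answers or len(responses) != len(answers):
        raise ValueError("responses and answers must be non-empty and aligned")
    synonyms = synonyms or {}
    exact = relaxed = 0
    for given, original in zip(responses, answers):
        g, o = given.strip().lower(), original.strip().lower()
        if g == o:
            exact += 1
            relaxed += 1
        elif g in synonyms.get(o, set()):
            relaxed += 1  # counted only under the relaxed criterion
    return exact / len(answers), relaxed / len(answers)
```

For instance, score_cloze(["a", "U.S."], ["a", "American"],
{"american": {"u.s."}}) counts one exact replacement out of two, but two out
of two under the relaxed criterion.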

In most applications, the cloze procedure involves presenting the
doctored passage "cold"; that is, the subject is not given advance opportunity
to read the passage in its unmutilated form. He is supposed to guess words
on the basis of the context or redundancy in the passage. His success or
failure in doing so is partly a function of the inherent difficulty of the
passage (including the inherent difficulties of guessing the deleted words)
and partly a function of his general comprehension ability, which in turn
may be a function of many factors: his verbal intelligence, his maturity,
education, and experience, and perhaps, according to the results obtained
by Weaver and Kingston (1963), a special aptitude for utilizing the
redundancy in the passage. When the cloze scores are based on systematic
deletions, a number of investigators (Taylor, 1957; Jenkinson, 1957; Greene,
1965) have found moderate to substantial correlations of cloze scores with
various measures of reading ability. However, Rankin (1958) concluded that
cloze tests in which the deletions are restricted to nouns and verbs are
"not very accurate" measures of general reading skill. Weaver and Kingston
reported that even though cloze scores may have moderate correlations with
certain measures of verbal intelligence, all eight of their cloze scores,
obtained with various types of material and with both auditory and visual
presentation, formed a factor-analytic cluster that they identified as
"redundancy utilization" ability.

Thus, when the cloze procedure is used to measure comprehension of a
passage in mutilated form, without prior exposure to the unmutilated form,
the score cannot be a pure measure of comprehension. One would at least
have to control for "redundancy utilization" ability on a sample of passages
and use that as a baseline for determining an individual's comprehension of a
particular passage. The complicated problems of equating involved in such
measurements have not been adequately treated in research so far. By certain
simple scaling techniques, Bormuth (1968a) found that if a pupil answered 
44% of the words on a cloze test, it was equivalent to answering 75% of
the questions on a more standard multiple-choice test of comprehension; his
result was based, however, only on the paragraphs and questions in the Gray
Oral Reading Test and may not be widely generalizable. Furthermore, this
result was intended to be applied only to assessing the readability and
grade-level suitability of instructional materials, not to assessing a
particular child's reading comprehension.

It has often been pointed out that the cloze technique measures a
rather superficial kind of comprehension: the ability to follow the detailed
ideas and grammatical patterns that occur within sentences or closely
adjacent groups of sentences. There is no clear evidence that it will
necessarily measure the ability to comprehend or learn the major ideas or
concepts that run through a longer discourse.

Numerous investigators have used cloze scores as a dependent variable
in the comparison of groups with different treatment or selection conditions.
In such investigations, it is possible that the confounding variables were
washed out and the results with the cloze scores may be taken as valid.
For example, Peisach's (1965) finding of social class and sex differences in
5th-grade children's ability to comprehend the speech of their teachers is
probably sound. On the other hand, a question may be raised about Tatham's
(1967) finding of differences in comprehension depending upon whether "high
frequency" or "low frequency" language patterns were used, inasmuch as the
cloze scores may have reflected nothing more than the "frequency" of the
language patterns; the results would be of significance only if the cloze
scores reflected comprehension of passages apart from the particular
language patterns used.
Sometimes cloze scores are used to measure gain in knowledge, as when
an unmutilated passage is presented, followed by a "cloze" test on the same
passage. Coleman and Miller (1968) found that cloze scores based on
systematic (every 5th word) deletions were unsatisfactory for measuring
knowledge gain, since the scores were hardly higher, on the average, than
scores made by individuals who had not seen the unmutilated passage. Greene
(1965) reported the same to be true of cloze test scores based on noun and verb
deletions. These findings are slightly at variance with those of Rankin
(1958), who compared noun-verb deletion scores with systematic (every 12th
word) deletion scores; the former he found to be "sufficiently accurate"
for measuring specific gains in comprehension and knowledge, while the
latter were not. Rothkopf (1968a) used content-word deletion cloze scores
in showing that the proportion of correct responses was an increasing but
negatively accelerated function of the number of times a student was allowed
to read a written passage. More research is needed on types of cloze scores
that will show knowledge gains when subjects are allowed to inspect an
unmutilated passage in advance of a cloze test, and/or on the conditions that
determine whether knowledge gains will be exhibited by such scores.

In view of the grossness of cloze-procedure measures, it is somewhat
remarkable that they have been so successful in many circumstances. This
success is achieved, in all probability, by the averaging of performance
over many separate items. There are indications that a more detailed analysis
of the responses in cloze tests would be worthwhile. Jenkinson (1957)
attempted to classify the kinds of clues that students use in performing
cloze tests, also studying the kinds of errors made and what those errors
indicated about sources of misunderstanding. A summary of her classification
of clues is as follows:

I. Structure
   1. Syntactical
      a) recognition of function words, parts of speech and word order
      b) recognition of punctuation and accurate location of referents
      c) errors of word recognition
   2. Awareness of language
      a) sensitivity to sound (as in poetry)
      b) sensitivity to style: appreciation of exactness of expression,
         recognition of rhetorical devices and the style of the author

II. Semantic
      a) identification of meanings of words, idioms, and groups of
         words in context
      b) identification of direct meanings of the whole passage
   2. Contextual
      a) anticipation of ideas and meaning
      b) retrospection to check meaning
      c) extension and reconstruction of meaning
   3. Ideational
      a) fusion of separate meanings of words or groups of words into
         ideas
      b) recognition of the sequence and interrelationship of ideas
      c) recognition of implied meanings

III. Approach
   1. Effort to obtain closure
      a) verbal closure
      b) negative
      c) tentative
      d) awareness of error
      e) verbal fluency and flexibility
   2. Use of experiential background
      a) general
      b) egocentric
   3. Intellectual
      a) imagining
      b) reasoning, analyzing, judging
      c) problem solving

More research needs to be done on the factors involved in guessing missing
words. Rothkopf (1962) found that performance was best when deleted words
were near the end of a sentence; this conforms to Forster's (1966) finding
that it is easier for a subject to provide an ending for a sentence already
started than to provide a beginning for the ending of a sentence. Pike
(1969) has made a detailed analysis of certain [...]; such information
should be of help in constructing more valid cloze tests.

9.2 Progressive cloze technique. I suggest the name "progressive
cloze" for a technique that has been used occasionally for scaling the
difficulty of material. It is modeled after a procedure introduced by
Shannon (1951) for measuring the redundancy of English. Shannon had subjects
try to guess a passage letter by letter. That is, they were told to guess
the first letter; the number of their guesses until [...] letter, etc.
Rubenstein and Aborn (1958) had subjects try to guess a passage word by word,
with only one guess per word, and measured the difficulty of the passage in
terms of the percentage of words correctly guessed by subjects. They
showed that "predictability" scores for passages obtained by this method
were highly correlated with readability and learning scores for other groups
of subjects. The technique has been used by others (e.g., [...]; Cohen and
Johansson, 1967, with Swedish) for scaling [...]. Poppa and Mettler (1967),
working with German, found that predictability scores were higher for
sentences with complicated syntax, however. Whether this was true because of
characteristics of the German language is as yet not [...].

Coleman and Miller (1968) found that this technique was suitable for
measuring information gain in individual subjects. Essentially, their
procedure had the subject make two trials with the same passage. On the first
trial, he was asked to guess the passage word by word. He was allowed only
one guess per word. According to Coleman and Miller:

"If he guessed wrong, he was told the correct answer, and then he at- 
tempted the next word. The measure of what he knew about material he had 
not read was simply the number of correct guesses per hundred words. 

"As the subject proceeded through the passage guessing every word, he 
must have studied it most carefully. As soon as he finished, he went 
through the passage again, guessing each word. The difference in correct 
words on his first and second attempt is a measure of IG [information gain]." 

The mean percentage of words guessed on the first trial was 33.73; on
the second trial, 72.66. The scores in the second trial correlated only
.57 with the scores on the first trial. However, these results were based
on only 9 subjects and there were no external criteria of validity. One
can only say that the method shows promise.
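
The arithmetic of the information-gain measure can be sketched as follows.
This is a minimal Python illustration of the two-trial, one-guess-per-word
scheme described in the quotation; the function names are hypothetical, and
matching guesses to the text by a case-insensitive comparison is an
assumption of the example.

```python
def percent_correct(guesses, passage_words):
    """Correct guesses per hundred words, one guess allowed per word."""
    if not passage_words or len(guesses) != len(passage_words):
        raise ValueError("need exactly one guess for every word of the passage")
    correct = sum(g.lower() == w.lower() for g, w in zip(guesses, passage_words))
    return 100.0 * correct / len(passage_words)


def information_gain(first_trial, second_trial, passage_words):
    """IG: second-trial score minus first-trial score, in percentage points."""
    return (percent_correct(second_trial, passage_words)
            - percent_correct(first_trial, passage_words))


words = "voter apathy is common".split()
first = ["the", "apathy", "was", "low"]       # 1 of 4 correct
second = ["voter", "apathy", "is", "high"]    # 3 of 4 correct
gain = information_gain(first, second, words)
```

Applied to the group means reported above, the same subtraction gives an
average gain of 72.66 - 33.73 = 38.93 percentage points.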

10.0 Construction and rearrangement tasks. As long ago as World War
I, when the Army Alpha Intelligence Test was constructed, a favorite method
of testing verbal intelligence has been to present a sentence with the words
scrambled. In current terminology, such sentences exhibit a type of
grammatical anomaly. Until recently, little study has been made of the
psycholinguistic processes involved in performing the task of reconstructing
the sentence. Clearly, there are individual differences in ability to perform
the task. Oléron (1961) presented subjects with scrambled groups of (French)
words; they were told that the words, when put into their original order,
constituted news items in a telegraphic style. Subjects had increasing
success in reconstructing the texts when the words were grouped by twos or
threes in their original order. This method permitted study of the roles
played by grammatical factors and verbal associations. Similar work has
been done by Sever (1968) with scrambled sentences in English. For certain
types of materials, ability to reconstruct a scrambled passage would appear
to be a good criterion of comprehension, but it points up the fact that
subjects do not necessarily use simple syntactical (word order) elements
in comprehension; rather, they use their knowledge of the syntactical and
semantic structures which particular lexical items are most likely to
enter. Ordinarily, the reconstruction task has been applied to single
sentences. Pfafflin (1967) found that Ss could re-order sentences that had
been scrambled within a paragraph.
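
Stimuli for a rearrangement task of this kind, with words either scrambled
singly or kept together by twos or threes, can be produced mechanically. The
Python below is a sketch only; the function scramble and its parameters are
hypothetical and do not reproduce the materials of any study cited here.

```python
import random


def scramble(text, group_size=1, seed=None):
    """Shuffle a text in units of group_size consecutive words.

    With group_size=1 every word is scrambled independently; with 2 or 3,
    neighboring words stay together in their original order.
    """
    words = text.split()
    groups = [words[i:i + group_size] for i in range(0, len(words), group_size)]
    rng = random.Random(seed)  # a seed makes the stimulus reproducible
    rng.shuffle(groups)
    return [" ".join(group) for group in groups]


pairs = scramble("the cat chases the dog today", group_size=2, seed=0)
```

Each element of the result is one unit to be put back into order by the
subject; larger units preserve more of the local grammatical context.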

CONCLUSIONS ON THE MEASUREMENT OF COMPREHENSION 

We have surveyed a wide variety of techniques that have been used by 
investigators to study language comprehension and the factors involved in 
it. It is evident that no one technique is universally valid for measuring 
comprehension; each technique has its own particular sphere of
appropriateness. A number of distinct purposes can be discerned in the
investigations surveyed:

(1) Measuring the general comprehension ability of individuals;

(2) Measuring the degree to which an individual comprehends a
particular sentence or passage;

(3) Investigating the psycholinguistic processes in the comprehension
of textual materials;

(4) Measuring the "comprehensibility," "readability," "listenability,"
or "learnability" of samples of textual materials;

(5) Measuring the "grammaticality" or "communicative acceptability"
of samples of textual materials.

In general, any one of the techniques might be used for any of the
above purposes, but for each purpose there are certain "methods of choice."

Measuring comprehension ability. Measurements of comprehension ability
must be based upon a substantial sample of materials ranging widely in 
difficulty, in order to produce scores that are reliable and that accurately 
indicate the general level of difficulty that the subject is able to compre- 
hend. The "methods of choice" are mostly the traditional ones, such as 
multiple-choice items, but several newer or more unusual techniques may also be 
considered. In approximate order of general usefulness, these methods may
be recommended:

2.0 Asking questions designed to test comprehension--provided that
the questions have been adequately pretested to exclude the
possibility that they are either too easy (and can generally be
answered without exposure to the material on which the items
are based) or too hard (pose problems extraneous to that of
comprehending material)

3.0 Following verbal directions

8.0 Verification against pictured referents

9.0 Techniques depending upon context and redundancy:

(a) the standard cloze technique, with deletion of every nth
word, where n may range from about 5 to about 12

(b) Carver's "chunked" comprehension test

(c) Insertion or substitution of words to "spoil the meaning"

1.4 Evaluation of the truth or falsity of a statement

1.5 Evaluation of the centrality or importance of ideas in a passage

10.0 Construction and rearrangement tests (generally applicable only
for written tests)

It will be noted that tests of memory or recall are not recommended for
measuring comprehension ability.

Measuring the comprehension of a particular text. Most of the techniques
listed above for measuring comprehension ability are also appropriate here,
except that even more attention has to be given to the pretesting of
materials. However, one should probably exclude the techniques listed under
9.0, "Techniques depending upon context and redundancy," since the measures
yielded here are too unreliable to be useful for evaluating comprehension of
a single text unless the text is fairly extensive. Also, some of the
techniques may be inappropriate for a particular text, e.g., one whose
meanings are not readily picturable, or one that does not lend itself to
having the subject follow verbal directions based on it. Again, tests of
memory or verbatim recall are not recommended, except that asking the
subject to give a free paraphrase of the text may have advantages in certain
cases. The disadvantage of the paraphrasing task is that it is hard to score
accurately.

Investigating psycholinguistic processes. The techniques
considered in this chapter can be of use in psycholinguistic investigations
of discourse comprehension, and I will not attempt to discuss them in detail
in this context. One caution may be mentioned, and that is that tests of
recall are very likely to be deceptive in that they fail to distinguish
between comprehension at the time of initial presentation and ability to
retrieve or reconstruct information at the time of recall.

Measuring the comprehensibility of texts. The history of methodology in
measuring comprehensibility (readability, listenability) seems to have been
characterized by a progressive substitution of one preferred technique for
another. Originally, the "method of choice" was asking comprehension
questions (method 2.0), but this was replaced by various stylistic analysis
counts when it was found that the latter could reasonably well predict the
former. We have not discussed these techniques above because they are not
direct measures of comprehension or comprehension ability; they deal only
with the characteristics of texts. More recently, however, the cloze
technique in one or the other of its forms has tended to be the method of
choice because of its simplicity (apart from the bother of administering and
scoring cloze tests) and apparent validity. The cloze technique is currently
the most favored technique, despite its unwieldiness. It may yet turn out,
however, that subjective judgments of the sort used by Carroll (1966) or
Schwartz, Sparkman, and Deese (1970) may come to replace the cloze
technique as a method of choice.

Assessing grammaticality or acceptability. Strictly speaking, one cannot
assess grammaticality except by grammatical analysis in terms of a particular 
grammatical theory. "Acceptability," however, can be assessed, but only, 
almost by definition, by subjective techniques. An extension of these 
subjective techniques occurs when subjects are asked to correct the 
grammar of a sentence, as did Quirk and Svartvik (1966) and Banks (1969b). 

THE MEASUREMENT OF LEARNING FROM DISCOURSE 

On the assumption that "learning from discourse" means "assimilation of
meanings into a long-term memory store," the measurement of such learning
must carefully distinguish between "comprehension at time of original
presentation" and "comprehension after a delay." Just what period of time
is referred to when we speak of "delay" must depend upon the circumstances:
we will review in Chapter 7 what is known about the retention of verbal
meanings after various delays. Various recall, recognition, and
reconstruction techniques are available for the measurement of retention. A
sharp distinction has to be drawn between "rote" memory and "logical"
memory, to use terms employed by Welborn and English (1937) and Cofer
(1941), that is, memory for verbatim content vs. memory for meaningful
content. A further distinction is that between learning ("what has actually
been stored") and performance (what the individual can retrieve from memory,
and what he can do with it). The tough problem for the would-be measurer is
to determine exactly what is perceived or comprehended at the time of
original presentation and what residual perceptions or comprehensions remain
at the time when retention is tested. In many studies of retention, there
has been a failure, either partial or complete, to determine what was
comprehended at the time of original presentation. This must be borne in
mind in the subsequent discussion.

Chapter 4

MESSAGE AND SOURCE-OF-MESSAGE CHARACTERISTICS 

This and the following four chapters will examine the major types
of factors in comprehension of, and learning from, MVD. For purposes
of analysis and exposition, these factors have to be discussed one 
by one. We will try to avoid artificiality in such an analysis by 
considering the relations between the factors as we proceed. 

The Comprehensibility of Texts 

What aspects of a text — its vocabulary, syntax, organization, 
style, content, etc.— make it relatively easy or difficult to under- 
stand as compared with other texts, holding constant such factors 
as the individual's competence with the language, his motivation to 
comprehend, his background knowledge, his interest in the material,
etc.? 

Much of the research on this question has been conducted in the 
context of trying to assess the comprehensibility of printed texts, 
i.e., their "readability." We know much less about the comprehensibility 
of materials presented auditorily, i.e., their "listenability."
This has led to some confusion, in the sense that the readability of
printed texts depends to a substantial degree on the reading ability
level of the reader, or more specifically, on his ability to "decode"
language from print. The characteristics of printed texts that make
them difficult to comprehend are in some measure (at least for
not-fully-skilled readers) those characteristics that make them difficult
to decode into spoken language. Because of the vagaries of its
orthography, English presents special difficulties in this respect;
we might expect somewhat different results if we were dealing with a
language (e.g., Spanish or Finnish) whose orthography is more regular 
than that of English. One might wish that research on comprehensibility 
of texts in English had been initiated with orally-presented texts. 

Such research would have disclosed more readily the characteristics
of language that present difficulties in understanding apart from the 
decoding of print. The research could then have proceeded to investigate 
comprehensibility of written texts, noting those aspects of difficulty 
that may be peculiar to written or printed language. Instead, research 
has tended to proceed in the other direction: after a long period of 
research on readability, some efforts were made to apply the results 
to the comprehensibility of orally-presented texts. Only in recent 
years has there been some interest in the comprehension difficulties 
in orally presented materials.

It should be pointed out that there are likely to be comprehension 
difficulties peculiar to oral texts, for example, those connected 
with homophones (different words, perhaps differently spelled, which 
are pronounced with the same phonemes). Furthermore, research on the 
comprehensibility of orally-presented materials involves special 
problems such as the control of articulation accuracy, intonation 
and stress, dialect, signal-to-noise ratio, and speech rate. 

However, a large proportion of the characteristics that make
oral language difficult are the same as those that make printed 
language difficult. With appropriate caution, we can generalize 
at least some of the results obtained with "readability" research to 
oral language. Because of the extensiveness of readability research, 
our review will examine it first.

Readability Research

Chall (1958), who made a detailed and scholarly review of the
research that had been done through about 1953, indicates that
early in this century the interest was in assessing textbooks and
supplementary reading material for the school grades; in the 1930's
the needs of adult education prompted study of ways to identify
easy reading for adults, and in the 1940's journalists and others
concerned with mass communication media joined in pursuing this kind
of research. Nevertheless, the basic techniques and assumptions have
remained relatively the same until very recent years. The major
assumption has been that linguistic elements (words, sentences, and
other objectively identifiable features in prose) can be counted and
somehow weighted to produce a "readability formula" to indicate the
reading ease or grade level of the material. In order to devise a
mathematical prediction formula, it was necessary to have available
an initial criterion of reading ease. Sometimes the criterion was
purely judgmental. A somewhat more objective criterion was provided
by measurements of readers' ability to answer questions covering
reading material. A favorite criterion of this sort was the scale
of reading difficulty, based on pupils' success in answering comprehension
questions over the material, provided by the McCall and Crabbs (1926)
series of paragraphs. A large number of formulas have been developed
and widely used to evaluate textbooks and reading material. Chall
(1958) compares the merits and demerits of many of them; a somewhat
more recent, but also very comprehensive, review has been provided
by Klare (1963). Most investigators attempted to develop formulas
that would be applicable over a wide range of reading difficulty, but
because of the materials and techniques employed, some were more
appropriate at lower levels, others more appropriate at upper levels.

On the basis of considerable research evidence, Chall concluded that
"when used to appraise materials of intermediate-grade difficulty,
the Lorge, Flesch, and Dale-Chall formulas assign similar
grade-levels, which average well within one grade of each other," but
that "above the seventh grade . . . the Lorge formula tends to give
considerably lower indexes than the Flesch and Dale-Chall formulas,
the discrepancy becoming larger as the difficulty of the material
increases" (Chall, 1958, p. 95).

Klare (1963) regards the Dale-Chall formula as the most accurate, 
the Farr-Jenkins-Paterson as the most convenient and easy to use,
and the Flesch Reading Ease formula as the most popular. He also 
makes several recommendations regarding formulas for use in measuring 
special characteristics of material (e.g., abstraction level), or 
for use in special circumstances (e.g., measuring the difficulty of 
psychological tests and inventories), and mentions special formulas 
for the readability of material at the beginning reading level. 
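
The counting-and-weighting approach behind these formulas can be illustrated with the Flesch Reading Ease formula. The sketch below uses the constants of the published formula (206.835 minus 1.015 times the average sentence length in words, minus 84.6 times the average number of syllables per word). The tokenizing rules and the vowel-group syllable counter are crude assumptions made only for this illustration, and the function names are hypothetical.

```python
import re


def count_syllables(word):
    """Rough syllable estimate: count groups of consecutive vowels.

    This heuristic is an assumption of the sketch, not part of the formula.
    """
    w = word.lower()
    n = len(re.findall(r"[aeiouy]+", w))
    # Treat a final silent "e" as non-syllabic, but keep "-le" (as in "table").
    if n > 1 and w.endswith("e") and not w.endswith("le"):
        n -= 1
    return max(n, 1)


def flesch_reading_ease(text):
    """Flesch Reading Ease: higher scores indicate easier material."""
    # Sentences are split on ., ! and ? (an assumption; abbreviations will miscount).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = sum(count_syllables(w) for w in words) / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    easy = "The cat sat. The dog ran."
    hard = "Psycholinguistic investigations necessitate methodological sophistication."
    print(round(flesch_reading_ease(easy), 2))
    print(round(flesch_reading_ease(hard), 2))
```

Only two counts enter the score, sentence length and syllables per word, which is why such a formula can be applied to a text without testing any readers.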

Chall and Klare have also discussed the validity of the formulas.
With the original criteria by which they were established (usually
the McCall-Crabbs paragraphs), the formulas had correlations of
about .70. Powers, Sumner, and Kearl (1958) recalculated four formulas
using the 1950 edition of the McCall-Crabbs paragraphs, with the
following multiple correlations corrected for degrees of freedom:

Formula                    Multiple        Proportion
                           Correlation     of Variance

Flesch Reading Ease          .6351           .4034
Dale-Chall                   .7135           .5092
Farr-Jenkins-Paterson        .5837           .3407
Gunning Index                .5865           .3440
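
The second column of the table is the square of the first: the proportion of criterion variance accounted for is the squared multiple correlation. A minimal sketch of that check follows. The figures are those reported in the table; the function names and the tolerance are assumptions of this illustration. Squaring the rounded correlations reproduces the reported proportions to within about .0001.

```python
# Multiple correlations and proportions of variance as reported in the table.
REPORTED = {
    "Flesch Reading Ease": (0.6351, 0.4034),
    "Dale-Chall": (0.7135, 0.5092),
    "Farr-Jenkins-Paterson": (0.5837, 0.3407),
    "Gunning Index": (0.5865, 0.3440),
}


def proportion_of_variance(r):
    """Proportion of criterion variance accounted for by a correlation r."""
    return r * r


def check_table(rows, tolerance=2e-4):
    """Return, per formula, whether r squared matches the reported proportion.

    The tolerance allows for rounding of the published four-place figures.
    """
    return {
        name: abs(proportion_of_variance(r) - reported) <= tolerance
        for name, (r, reported) in rows.items()
    }


if __name__ == "__main__":
    for name, (r, reported) in REPORTED.items():
        print(f"{name:24s} r = {r:.4f}  r^2 = {proportion_of_variance(r):.4f}  "
              f"reported = {reported:.4f}")
```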



All four recalculated formulas agreed much more closely with one 
another than the original Dale-Chall and Flesch formulas did.

Nevertheless, these calculations may tend to overestimate the validity 
of the formulas because they merely reflect the capacity of the 
formulas to correlate with the criterion on the basis of which they 
were developed. The evidence on the validity of the formulas against
"external" criteria is much more mixed. Although there are more
positive results than otherwise against such criteria as reading
comprehension, reading speed, readership, and writer ability, it
cannot be said that the readability formulas available at the time of
Klare's review were of impressive validity. Klare (1963, p. 155)
stated that if attention is restricted to "modern" studies (those
appearing in 1946 or later), 35 had positive results, 9 "negative"
(i.e., with correlations less than .50), and 9 "indeterminate."

Chall (1958, p. 157) pointed out that "of the diverse stylistic
elements that have been reliably measured and found significantly related
to difficulty, only four types can be distinguished: vocabulary load,
sentence structure, idea density, and human interest." Of these factors,
vocabulary load "is most significantly related to all criteria of
difficulty so far used." Klare (1963) feels, with probable justification,
that "human interest" is not logically related to actual comprehension
difficulty; hence this factor should probably not be considered
within the scope of comprehensibility measurement. Factors falling
roughly in the areas of "vocabulary load" and "sentence structure"
accounted for most of the variance in two independent factor analyses
(Brinton and Danielson, 1958; Stolurow and Newman, 1959) of data
originally published in 1935 by Gray and Leary. The opinion seemed
to be widespread, early in the 1960's, that further progress in the
measurement of readability could be made only by refining measures
of the limited number of factors that appeared to determine it.

The results of readability assessment were often
counterintuitive. For example, Stevens and Stone (1947) found that Koffka's
notoriously difficult psychological writings were evaluated as "quite
easy" by the Flesch formula, while William James's pleasant and easily
read writings were evaluated as quite difficult. Lockman (1958)
actually found negative correlations between Flesch readability formula
results and rated "understandability." There were also justified
warnings and cautions about the uncritical use of readability measurements,
either in the selection of children's literature or the writing of
"more readable" prose. Both Chall and Klare, in their reviews,
stated that the manipulation of the elements of readability counted
by the formulas could not be relied upon to produce more readable
prose; Klare recommended that readability measurements be applied
only post hoc, to measure the readability of something already written,
not to guide its writing. Nevertheless, the works of Flesch and others
were widely influential in getting writers of material for education,
business, or government to write with smaller vocabulary loads and
simpler sentence structures.

Such was the state of readability research and application
around 1960. The publication of Chall's review in 1958 marked the
beginning of an era of intensified research. At least three trends
began to be evident:

(1) Completely unmentioned in Chall’s review, and given only 
scant attention by Klare, the work of Wilson Taylor (1953) on the
"cloze" technique attracted wide interest. The "cloze" technique
was offered not only as an improved criterion measure for readability
research, but also as a convenient and more valid measure of readability
itself. (Its importance was minimized by Klare because it did not
fit within his definition of a "readability formula.")

(2) The advent of greater precision in syntactical analysis 
through developments in linguistics made more refined study of
sentence complexity possible. 

(3) Advances in technology and in computer analysis of text 
made it possible to foresee the computerization of readability 
measurement (Smith and Senter, 196?; Shaw and Jacobson, 1968;
Klare, Rowe, St. John, and Stolurow, 1969).

Taylor's judgment, in 1953, that "... a cloze score appears
to be a measure of the aggregate influences of all factors which 
interact to affect the degree of correspondence between the language 
patterns of transmitter and receiver," and thus to be an adequate 
measure of readability, seems to have been reasonably well borne 
out by more recent research. In 1968, the National Council of 
Teachers of English, in cooperation with the National Conference on 
Research in English, published a pamphlet (Bormuth, 1968b) that 
reprinted a number of articles on readability, mainly oriented
around the use of the cloze technique (Bormuth, 196Tt, 1968b; Klare,
Coleman, 1968a, 1968b). Bormuth claimed at that time
that "the readability formulas available only three years ago
could, at best, predict only 25 to 50 percent of the variation we
observe in the difficulties of instructional materials," while
"today, we have not one but several prototype formulas which are
able to predict 85 to 95 percent of the variation." Bormuth was
referring to his research (Bormuth, 1966b) in which a wide variety
of linguistic variables were used to predict cloze measures of 20
passages selected to represent a wide variety of prose styles, with
a roughly even distribution in Dale-Chall readability from about 4.0
to 8.0 in grade level. Multiple correlations, even with as few as
four variables, ranged up to .934.* Some variables, particularly
those involving word counts, were found to have a curvilinear relation
to the criterion. Little evidence was found for differential validity
of readability elements at different levels of reading ability.

Bormuth felt that further refinement of his results would make
possible new readability formulas that would be not only highly
accurate and valid, but also easy to compute and use.

*Such correlations must be viewed with some caution in view of the
small N on which they are based.

The reasons for the great "breakthrough" in readability
measurement, according to Bormuth, were (1) the availability of the
cloze technique as an improved criterion of comprehensibility, and
(2) the availability of new linguistic variables that could be applied
to readability measurement. In Bormuth's 1968 article, it was stated
"we have now learned enough to design much sounder readability
formulas"; however, an improved readability formula for general use
has not yet been promulgated. In any case, Bormuth believes that
"most future readability formulas will probably be designed to provide
a profile of the level of difficulty represented by each of the
language features in a passage."

The successes apparently achieved by this research have given
new encouragement to the idea that elements of language found to
cause comprehension difficulties can be manipulated in order to
prepare material that will be more readable (Coleman, in press),
or even more learnable (Coleman and Miller, 1968). This idea has
yet to be tested extensively; it may be that manipulation of some of
the newer linguistic variables will prove more effective than
that of variables that entered the older readability formulas.

The enthusiasm generated by the recent readability research must
be tempered by certain considerations:

1. How valid is the cloze technique? This matter has already
been considered in Chapter 3, where it was pointed out that while the
customary cloze technique (systematic deletion of every 5th word)
produces scores that correlate satisfactorily with reading comprehension,
scores involving only lexical (content word) deletions do not correlate
with reading comprehension ability. Further, it was noted that
cloze scores are apparently complex, reflecting not only reading
comprehension ability but also a special ability to utilize redundancy
in a passage. It was also noted that cloze scores do not ordinarily
measure information gained from a passage, but simply the
understandability of the passage during actual exposure to it. Now, these
possible defects of the cloze technique probably are largely irrelevant
to readability research, where passages are graded in comprehensibility
by averaging scores over readers, because variations in "redundancy
utilization ability," if such exist, or in actual learning from the
passage would balance out through randomization. Nevertheless, if
the researcher is interested in grading passages for aspects other than
sheer comprehensibility, he would be well advised to try the
"progressive cloze" procedures utilized by Rubenstein and Aborn (1958) or
Coleman and Miller (1968), or the procedure of deleting only content words
employed by Rankin (1958). The usual cloze procedure may be thought
of as a technique for detecting what may be called "local
comprehensibility," i.e., the comprehensibility of individual sentences in
their immediately surrounding contexts. To the extent that systematic
deletions touch function words, cloze scores are not likely to be
sensitive measures of comprehension of main ideas and conceptual
organization in prose.
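
The customary procedure described above, systematic deletion of every nth word, can be sketched as follows. This is a minimal illustration under stated assumptions: words are split on whitespace (so attached punctuation is deleted along with its word), and responses are scored by exact-word match ignoring case and punctuation, a scoring rule the text does not specify. The function names are hypothetical.

```python
import string


def make_cloze(text, n=5, blank="_____"):
    """Delete every nth word of `text`; return the mutilated passage and the key.

    Because deletion is purely positional, it falls on function words and
    content words alike.
    """
    items, answers = [], []
    for position, word in enumerate(text.split(), start=1):
        if position % n == 0:
            answers.append(word)
            items.append(blank)
        else:
            items.append(word)
    return " ".join(items), answers


def _normalize(word):
    # Ignore case and surrounding punctuation when comparing responses.
    return word.strip(string.punctuation).lower()


def score_cloze(answers, responses):
    """Proportion of blanks restored with the exact original word (assumed rule)."""
    if len(answers) != len(responses):
        raise ValueError("one response is required for each blank")
    if not answers:
        return 0.0
    correct = sum(_normalize(a) == _normalize(r) for a, r in zip(answers, responses))
    return correct / len(answers)


if __name__ == "__main__":
    passage, key = make_cloze(
        "the quick brown fox jumps over the lazy dog again today", n=5
    )
    print(passage)
    print(score_cloze(key, ["jumps", "once"]))
```

A passage's comprehensibility would then be estimated by averaging such scores over a group of readers, as the text describes.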

2. How do cloze scores interact with the overall readability
level of the material? No clear demonstration is available that the
same processes of comprehension operate for materials of high and low
difficulty.

3. How do cloze scores interact with the characteristics of
readers? Bormuth (1966a) attempted to answer this question by stratifying
his sample of elementary school children according to reading ability
and calculating interactions between ability level and various linguistic
indices. A number of significant interactions were found, particularly
for indices concerned with words (as opposed to clauses and passages
as a whole), but he attributed most of these to ceiling effects.
His conclusion was that in general the same elements caused comprehension
difficulty at all the levels of reading ability he identified in his
samples. Bormuth's evidence is not sufficient, however, to rule out
the possibility of meaningful interactions between cloze scores and
reader characteristics. His reading ability levels were limited to
those found from the 4th to 8th grade in a typical school system;
they may not, therefore, have included very low or very high levels.
Bormuth also failed to report whether the cloze scores themselves
were linearly correlated with reading levels. Even if they were,
curvilinearity might have arisen if a wider range of reading levels
had been included. Bormuth (1968a) reports a number of very high
correlations between cloze scores and various other measures such as
conventional multiple-choice comprehension tests, correlations that
approach unity when corrected for attenuation. However, these data
were collected exclusively on elementary school students. Research
using the cloze technique needs to be extended to include very high
and very low reading ability levels. Coleman (1968a) worked with
several variables such as word spelling and phonic regularity that may
be peculiarly associated with readability at low levels of reading ability.
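
The phrase "corrected for attenuation" refers to the standard correction in which an observed correlation is divided by the square root of the product of the two measures' reliabilities. A minimal sketch follows; the numerical values in the usage example are hypothetical and are not taken from Bormuth's data.

```python
import math


def correct_for_attenuation(r_xy, r_xx, r_yy):
    """Estimate the correlation between two measures if both were perfectly reliable.

    r_xy: observed correlation between the two tests
    r_xx, r_yy: reliability coefficients of the two tests
    """
    if not (0 < r_xx <= 1 and 0 < r_yy <= 1):
        raise ValueError("reliabilities must lie in the interval (0, 1]")
    return r_xy / math.sqrt(r_xx * r_yy)


if __name__ == "__main__":
    # Hypothetical figures: an observed correlation of .72 between a cloze test
    # and a multiple-choice test with reliabilities of .81 and .64.
    print(correct_for_attenuation(0.72, 0.81, 0.64))
```

With reliabilities of this size, a fairly modest observed correlation is already consistent with a corrected value near unity, which is one reason such corrected figures call for cautious interpretation.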

4. How practical will it be to use cloze scores for other than
research purposes? The advantage of a reading formula is that it can
be applied directly, in the quiet of one's study, to measuring the
readability of a text. Use of the cloze procedure, on the other hand,
involves testing a group of readers, preferably varying considerably
in reading ability, and averaging the results. Even after this process,
however, the scores may have no absolute meaning. Bormuth (1968a)
attempted to remedy this situation by statistically equating cloze
scores to more conventional criteria of understanding. Two levels
were chosen, (1) the "instructional level," traditionally understood
(according to Bormuth) to be reflected by the ability to answer [...]
of comprehension questions over a passage, and (2) the "independent
study" level, represented by ability to answer 95% of questions over
a passage. A cloze score of [...] (based on systematic deletions of
every 5th word) was found to be equivalent to the "instructional"
level, and a score of 57% to the "independent study" level. These
results are only a partial remedy for the problem; what is needed is
a study of the equating of the full range of cloze scores to reading
grade levels or the like, for groups of given characteristics. For
example, an appropriate table of results would make it possible to
find the appropriate grade level of a passage, given the average
cloze score attained by pupils in any given grade.

Listenability 

In the 1950's, specialists in oral communication began to take 
an interest in the "listenability" of materials presented orally. 

Texts to be presented orally were subjected to some of the same 
"readability" analyses that had been traditionally applied to reading 
materials. The evidence is very sparse as to whether such application 
of readability formulas is generally valid for the appraisal of whether 
a text is more "listenable" when presented orally. Part of the 
difficulty, of course, is that oral presentation of material entails
two opposite effects: on the one hand, it eliminates some of the factors
that affect readability, in particular, ability to decode print, and
on the other hand, it introduces additional factors, notably the
ability of the speaker to "deliver" the message, and the rate of
presentation.

An examination of the meager evidence assembled to date forces
one to conclude that the application of standard readability formulas
to prose destined for oral presentation is risky at best. Nevertheless,
all the studies examined that seemed to be relevant to the problem
do show positive relationships; positive relationships are exhibited
at all age levels. At the elementary school level, Rogers (1952)
was able to make a valid modification of the Dale-Chall formula. In
a careful study using 6th-graders, Allen (1952) found that when the
Flesch Readability Index and Human Interest measures were used to
contrive spoken film commentaries, the Readability Index correlated
positively with pupil gain from pretest to posttest on each of two
films, and the Human Interest measure did for one of them. Sentence
length was the most important factor. However, the design of Allen's
experiment also suggests that another factor was operating, namely the
extent to which the commentary followed a "patterned outline."
Harwood's (1955) experiment, conducted at the 10th grade level, showed
clear correlations between Flesch readability indices for seven short
stories and pupils' ability to answer questions on them when presented
auditorily. The pattern of results for these same paragraphs presented
in printed form was highly similar, except that for some of the more
difficult paragraphs the comprehension scores for listening were
somewhat lower.

Evidence at the college level is more meager. Chall and Dial
(1948) found that the Dale-Chall formula applied to radio news broadcasts
tended to correlate with students' ratings of understandability and
comprehension, but the effect was noticeable only at the extremes,
i.e., for very easy and interesting broadcasts as contrasted to very
difficult ones. Beighley (1952, 1954) in a careful study of various
speaker and presentation factors found that comprehension scores for
an "easy" speech were in most cases significantly different from those
for a "hard" speech; the speeches had substantially different ratings
by the Dale-Chall formula, but were also differentiated in terms of
their ratios of abstract to concrete material. Manion (1953) found no
validity for any of the elements in Flesch, Lorge, and Dale-Chall formulas
in predicting ratings of "understandability" of a spontaneous group
discussion by the participants therein; it is doubtful, however, that
spontaneous speech that would occur in a discussion would exhibit the
characteristics of formal speech prepared in advance, and Manion's
results therefore have questionable applicability.

Interest in measuring "listenability" of longer discourse seems 
to have declined since the 1950's. To date there seems to have been 
little attempt to apply any of the newer methods, such as the cloze 
technique, for this purpose. (Subsequent sections, however, will 
report a number of studies using the cloze technique, rote memorization 
scores, and various stylistic indices to appraise the comprehensibility
of shorter discourse such as single sentences.)

It would be desirable to establish baselines for the comprehensibility 
of verbal material presented orally (thus, without the intruding variable 
of reading ability), for comparison with data on the readability of 
the material when presented in printed form. The small experiment by 
Harwood (1955) is the only one that attempted to make such comparisons; 
it should be repeated on a large scale, at different grade levels, with 
more adequate samples, and with a greater variety of comprehensibility 
indices. 

Recently, a series of experiments on variables affecting the 
communicative effectiveness of teachers' lectures has been performed under 
the direction of Gage (1968). It has been demonstrated that teachers 
differ consistently in ability to give information lectures as judged 
by pupil gain scores on comprehension tests. While traditional measures 
such as vocabulary load and sentence complexity have little or no 
validity in predicting gain scores, there is evidence (Rosenshine, in 
press) that measures of such factors as "vagueness" (indexed by overuse 
of such words as very, pretty, some, maybe, etc.), "explaining links" 
(skillful use of such words as therefore, because, etc.), and use of 
examples will yield valid predictions. This line of research is promising 
and important. 
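
Word-count indices of the kind just described can be illustrated with a 
minimal sketch. It is not the scoring procedure used in the studies cited; 
the two cue lists contain only the words named in the text (the "etc." is 
left unfilled), and the function name and the per-100-words scaling are 
assumptions made for illustration. 

```python
import re

# Cue lists limited to the words named in the text; the full lists used in
# the research are not given in this review, so these are stand-ins.
VAGUE_WORDS = {"very", "pretty", "some", "maybe"}
LINK_WORDS = {"therefore", "because"}


def cue_rates(transcript):
    """Return (vagueness, explaining-link) occurrences per 100 words."""
    words = re.findall(r"[a-z']+", transcript.lower())
    if not words:
        return 0.0, 0.0
    vague = sum(w in VAGUE_WORDS for w in words)
    links = sum(w in LINK_WORDS for w in words)
    scale = 100.0 / len(words)
    return vague * scale, links * scale


print(cue_rates("Maybe it is very hot because the sun is out"))
```

A count of this kind is crude: it cannot tell "pretty" used as a hedge from 
"pretty" used as an adjective, so any serious use would need hand checking 
of the matches. 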

Source-of-Message Characteristics 

Petrie (1963) states: 

"Although a considerable amount of experimental evidence indicates 
that source credibility influences opinion change... there is little 
experimental support for the assumption that source credibility or 
source sincerity influences the amount of information learned and 
retained from an informative speech. Although Kelman and Hovland ... 
report that high school students were able to recall persuasive material 
more readily when it was presented by a 'neutral' source rather than by 
one which was 'negative' or 'positive,' most investigators report that 
source credibility, source sincerity, and the audience's like or 
dislike for the speaker have no effect upon the listener's comprehension 
of the message." 

Vocabulary Load as a Message Characteristic 

It is commonly recognized that one of the factors making a 
text easy or difficult to understand is its vocabulary load. Numerous 
studies conducted in the earlier years of the present century drew 
attention to the role of vocabulary load in creating difficulties in 
pupils' comprehension in literature (Irion, 1921?), in social studies 
(Dewey, 1935a, 1935b), in science (Curtis, 1938), and other subjects. There 
has been much concern with developing lists of words graded in difficulty 
for various educational levels, usually based on frequency counts 
(Buckingham and Dolch, 1936; Rinsland, 1945; Thorndike and Lorge, 1944). 

Measurements of vocabulary load have figured prominently in 
readability formulas. According to Klare (1968), "Of the 31 formulas 
published up to 1960, 17 use a word-count factor directly and most 
others a related factor (e.g., word length)." For example, the 
formula he regards as most accurate, the Dale-Chall formula, 
contains a factor based on the percentage of words that are not included 
in the Dale list of the 3000 words found to be known by at least 80 
percent of 4th graders. In the 376 passages in Books II to V of the 
McCall-Crabbs (1926) test lessons, the mean percentage of such 
words was 8.1011 with a standard deviation of 6.3056. (The distribution 
must have been considerably skewed positively.) This had the highest 
correlation, .6833, with the criterion, the reading-grade score of a pupil 
who could answer one-half the test questions correctly. It may be noted 
that the Dale list is based not on frequency but on familiarity. 
Certain words such as bracelet, watermelon, and cabbage appear on the 
Dale list despite having low frequencies in the Thorndike-Lorge list. 
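
The vocabulary factor described above can be sketched in a few lines. The 
coefficients below are those of the Dale-Chall raw-score formula as 
published in 1948 (they are not stated in this review); the familiar-word 
set is a toy stand-in for the Dale list, the function name is hypothetical, 
and the published counting rules for proper names, inflected forms, and the 
like are omitted, so this is an illustration only. 

```python
import re

# Toy stand-in for the Dale list of about 3000 familiar words (assumption).
FAMILIAR = {"the", "dog", "ran", "to", "a", "big", "house",
            "and", "sat", "down"}


def dale_chall_raw(text, familiar=FAMILIAR):
    """Return (percent unfamiliar words, mean sentence length, raw score)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    pct_unfamiliar = 100.0 * sum(w not in familiar for w in words) / len(words)
    avg_sentence_len = len(words) / len(sentences)
    # Raw score: weighted vocabulary factor plus weighted sentence length.
    raw = 0.1579 * pct_unfamiliar + 0.0496 * avg_sentence_len + 3.6365
    return pct_unfamiliar, avg_sentence_len, raw


sample = "The dog ran to a big house. The dog sat down and meditated."
print(dale_chall_raw(sample))
```

With a realistic word list the first returned value corresponds to the 
percentage whose mean and standard deviation over the McCall-Crabbs 
passages are quoted above. 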

Elley (1969) has developed a promising method for assessing 
readability solely on the basis of weights for noun frequencies. 

Vocabulary load has also been shown to be a factor in the comprehension 
of spoken material. For example, Yoakam (1947) gave tests involving 
three versions of a radio news story to groups of high school pupils; 
comprehension, as measured by a test that was the same for all groups, 
was highest when the difficulty of vocabulary was low. 

Furthermore, vocabulary difficulty has been shown to play some 
role in learning. Hall (1954) had college students try to recall random 
lists of 20 words after serial presentation at the rate of 5 seconds 
per word. Mean recall for lists containing words of 1-per-million 
frequency (by Thorndike-Lorge counts) was 12.04; for word lists with 
10-per-million frequency, the mean was 13.31; and for word lists with 
30-per-million frequency, 15.02, all differences being significant. 
However, Tulving and Patkau (1962) found that while word frequency 
played a significant part in such free recall, it did not when the 
results were scored in terms of "adopted chunks," i.e., sequences of 
responses that preserved the order in which they stood in the original 
presentation. Word frequency was nevertheless related, in this study, 
to the mean size of the "chunks" adopted. Studies exploring various 
other details of the role of word frequency in verbal learning are by 
Sumby (1963), Lloyd (196^^), Winnick and Kressel (1965), and Follettie 
and Wesemann (1967). Without going into the details of these studies, 
one may conclude that the role of word-frequency is not simple. It 
would appear that the mere frequency of a word in large word-counts 
is not the crucial variable. Sumby suggested that there is a tendency 
for high-frequency words to be associated and learned on a semantic 
basis, while low-frequency words are associated on a phonetic basis. 
Winnick and Kressel's results turned up the fascinating finding that 
frequency is highly correlated with meaningfulness and learnability 
for "concrete" words, but the correlation is insignificant for "abstract" 
words. Darley, Sherman, and Siegel (1959), Gorman (1961), Spreen and 
Schulz (1966), and Paivio (1969) have developed methods for scaling the 
abstract-concrete dimension of words. "Concreteness" appears to favor 
learning when the task requires production of the responses. It also 
favors recognition, according to Gorman's results, but frequency operates 
in the other direction. Both Gorman and Shepard (1967) found 
that subjects are better able to recognize rare words as being previously 
presented; apparently such words make a greater impression on the subjects 
when first presented, or are less likely to be confused with other 
words. 

With Anisfeld and Lambert's (1966) finding that "pleasant" 
words are learned faster only when they are response-terms in nonsense- 
syllable-word pairs, the several variables considered here (frequency, 
abstractness-concreteness, and pleasantness, along with the type of 
learning task involved) are seen to have fairly complex relations that 
have not yet been adequately investigated. Exactly what implications 
these findings have for the learnability of prose materials as a 
function of the characteristics of the words contained in them is 
not clear. However, most of the experiments have been conducted 
using college students who could be expected to know most of the words 
involved. Different results might be obtained if the experiments were 
conducted with elementary school or high school students with lower 
average vocabulary levels. To put the matter in another light, 
experiments on learning, when the independent variables are characteristics 
of the words to be learned, must take into account the degree of 
comprehension of the words on the part of the subjects. 

The role of words in making a text easy or difficult to understand 
is actually a very complicated matter: 

(1) Many words have multiple meanings and multiple grammatical 
usages. The simple word "like" can be used as a noun, a verb, an adjective, 
a preposition, a conjunction, an adverb, and a suffix, in various senses. 
This is the general phenomenon of homonymy. In spoken English, different 
words that have the same sound, as meet and meat, are called homophones; 
in printed English, different words that have the same spelling, 
as row ("array," or "to propel a boat") and row ("quarrel") are called 
homographs. Frequency lists rarely take account of these multiple 
meanings and grammatical usages. It is possible, therefore, that even 
when a text contains words of apparently "high" frequency, the particular 
usages of those words may be of low frequency and hence may present 
considerable difficulty for comprehension. This matter has not been 
investigated systematically, but representative researches touching 
on it are by Howards (1964), Ammon and Graves (19®), and MacGinitie 
(1969). 

(2) Students differ enormously in their vocabulary knowledge. 
A word may be totally familiar to one student, totally unknown to 
another, and known only in a different sense-meaning to a third. A 
fourth student may be able to infer the meaning of the word from the 
context. The effect of vocabulary knowledge also may vary depending 
upon whether the presentation is oral or written: for young children, 
listening vocabularies are larger than reading vocabularies, while for 
educated adults, reading vocabularies may actually be slightly larger 
than listening vocabularies. Research on student differences in 
vocabulary knowledge will be reviewed in Chapter 8. At this point in 
our review we can only say that we need more information concerning the 
"grade placement" of words. Some of the word lists previously cited 
(Buckingham and Dolch, 1936; Rinsland, 1945) attempt to place words 
by grade level, but these lists extend only to the upper elementary 
grades. Dale and Eicholz (undated) issued around 1960 a preliminary 
report of their research designed to produce lists for grades 4, 
6, 8, 10, and 12. Diederich and Palmer (1956) reported the difficulty 
in grades 11 and 13 of 4,800 words from 6,000 through 20,000 in frequency 
rank according to the Thorndike lists. Unfortunately, these lists 
are not organized and integrated in such a way as to permit convenient 
use. Even Thorndike and Lorge's (1944) frequency list is organized in 
three separate alphabets, one for the 19,440 most common words and two 
for 10,560 other, less common words. Thorndike and Lorge suggest 
grade levels for the several frequency ranges, without citing any 
research basis for their suggestions. It should be borne in mind 
that word frequency is not a sure guide to word difficulty (Gates, 
Bond, and Russell, 1938). There are many low frequency words that are 
quite familiar to children, and yet some of the senses of high frequency 
words are unfamiliar even to persons at advanced educational levels. 
Furthermore, as Serra (1954) has warned, the mere simplification of 
vocabulary will not necessarily promote comprehension when the concepts 
being presented by a text are inherently difficult. 

Because of the limitation in scope announced in Chapter 1, we have 
not considered here the problem of difficulties of word perception 
either in auditory or visual presentation. For a review of work on 
speech intelligibility, see Black (1961b). Traul and Black (1965) 
showed that increasing word context aids word identification in aural 
perception. Klare (1968) has reviewed studies relating word frequency 
to tachistoscopic perception. 

Syntactic Factors in Text Difficulty 

Some remarks on this matter have already been made in Chapter 2 
(pp. 44-49). A brief but more analytic treatment is given here. 

Length of sentence or material. Length of sentence is a frequent 
factor in readability formulas. MacGinitie and Tretiak (1969) found 
mean sentence length a better predictor of readability than a measure 
of grammatical depth (see below). Follettie and Wesemann (1967), 
Martin and Roberts (1967), and Epstein and Arlinsky (1965) found 
length of sentence or paragraph to be a significant factor in ability 
of subjects to memorize or recall the material. However, as was 
demonstrated by Schlesinger (1966b), length of sentence is not an 
important variable when other factors are controlled, namely, 
the grammatical construction of the sentence. This finding pertained 
to the sentence level. There has been little research beyond that of 
Lyon (1917) on the influence of length on the learning of prose material; 
see Frase (1967). 

Grammatical structure. Recent psycholinguistic research, inspired 
mainly by the work of Chomsky (1957, 1965, 1967), has concentrated 
its efforts on determining the role of grammatical structure in the 
comprehension and learning of sentences. 

Phrase-structure constituents. Many techniques have been employed 
to demonstrate that sentences are perceived in terms of phrase-structure 
constituents. Huttenlocher showed that at early ages children 
have difficulty, in fact, in perceiving separate words as constituents 
of phrases. The most cogent work on this problem has been done by 
N. F. Johnson (1965) and Martin (1970). The "click" experiment 
(Bever, 1968; Scholes, 1969), the "probe technique" (Ammon, 1968, 1969), 
and the eye-voice-span technique (Schlesinger, 1966b) are also useful. 
Suci (1967) and Suci and Gruenfeld (1969) have investigated the role 
of pauses. Wilson (1966) showed little effect of phrase structure for 
memory functions in young children. 

Grammaticalness. Artificial materials can be constructed with 
various degrees of conformity with presumed grammatical and semantic 
rules of the language. There is generally a high degree of agreement 
as to how "grammatical" a sentence is (Coleman, 1965b; Danks, 1969a, 
1969b; Danks and Lewis, 1970; Downey and Hakes, 1968; Stolz, 1969; 
Tikofsky and Reiff, 1967; Tikofsky, Reiff, Tikofsky, Oakes, Glazer, 
and McInish, 1967), but under certain circumstances this is not necessarily 
the case (Maclay and Sleator, 1960; Quirk and Svartvik, 1966). 

Syntactic anomaly. Detailed studies of the relation between 
grammaticalness and ease of learning have been focused on the variable 
of syntax. Significant positive relations have been found by Coleman 
(1965a, 1965b), Epstein (1961, 1962), Johnson (1968a), Marks and 
Miller (1964), Martin, Davidson, and Williams (1965), and Wang (1970). 
Lezotte and Byers (1968) found a perturbation in this relationship 
in that semi-grammatical sentences were less well learned than 
sentences totally lacking in grammaticalness. Miller (1962a) found that 
grammaticality was positively correlated with intelligibility in noise. 
Rohwer, Shuell, and Levin (1966) found that noun pairs were better 
learned when they were inserted in simple declarative sentence frames 
than when they were simply connected by conjunctions. Salzinger and 
Eckerman (1967) found a positive relationship but pointed out that 
frequency effects could explain the results as well as grammatical 
theory; this type of explanation was also proposed by Goldman-Eisler 
and Cohen (1970). Fillenbaum (1970) gave several reasons for cautioning 
against the use of memorial techniques to assess the comprehension of 
syntax. Salzinger, Salzinger, and Hobson (1966, 1967) used various 
degrees of syntactic anomaly in testing linguistic abilities of 
middle-class and disadvantaged children. 

Semantic anomaly . Grammaticalness can also be studied by holding 
syntax constant but varying semantic features and subcategorization 
rules. Davidson (1966) and Stolz (1969) found learning correlated with 
grammaticality as expected; Downey and Hakes (1968) did not. Apparently 
the critical factor is the method of measuring learning. 

The relative roles of syntax and semantics. This raises difficult 
theoretical and experimental problems. In general, as Schlesinger (1966b) 
points out, "complete separability of syntax and semantics is an 
untenable proposition." One experimental approach has been through the 
study of what has been called the "footnote hypothesis," i.e., the 
notion that the basic meaning of a sentence is paramount but that the 
syntactic form of a sentence is remembered as a kind of "footnote." 
Positive evidence for this hypothesis has been found by Miller (1962b), 
Mehler (1963, 1968a), and Morris, Rankine, and Reber (1968). However, 
Rosenberg (1968b) showed that when only one type of syntactical structure 
has to be remembered, syntactic complexity is not related to recall. 
Bregman and Strasberg (1968) also present negative evidence. 
Nevertheless, the work of Sachs (1966, 1967a, 1967b) shows that the syntactic 
form of a sentence is forgotten very rapidly in comparison to forgetting 
of its semantic content. 

If one is thinking only of comprehensibility, Hamilton and Deese 
(1970) claim that grammaticality is more important than semantics. 
Mehler and Carey (1967, 1968) show that changes in surface structure 
have a stronger effect than changes in base structure, and that syntax 
interacts with veracity. 

Grammatical complexity. Efforts have been made to measure the 
overall grammatical complexity of a sentence and relate this to 
comprehensibility and to recall. Theory provided by Yngve (1960) 
has been utilized for this purpose by Bormuth (1964a), Brown (1967), 
Forster (1967), MacGinitie and Tretiak (1969), Martin (in press), 
Martin and Roberts (1966), Nurss (1967), Perfetti (1969), and Wearing 
(1970), but with somewhat conflicting results. For example, Bormuth 
finds "mean word depth" a better predictor of comprehension difficulty 
than sentence length, whereas MacGinitie and Tretiak find the opposite. 
Wearing found sentences with low mean depth better remembered than 
sentences with high mean depth, whereas Perfetti found that depth had 
no influence. Nurss found that syntactical structure indexed by depth 
affects reading difficulty when measured by oral reading errors, but 
not when measured by a picture comprehension test. 
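
For readers unfamiliar with the depth measure, the following minimal sketch 
shows one common reading of Yngve's procedure: at every node of the phrase 
structure tree the branches are numbered from the right starting at zero, 
and the depth of a word is the sum of the numbers on the path from the root 
to that word. The nested-tuple tree format, the function names, and the 
sample parse are assumptions made for illustration; the studies cited above 
may have differed in how they parsed sentences and averaged depths. 

```python
def yngve_depths(tree, inherited=0):
    """Return [(word, depth)] for a tree of nested tuples with string leaves."""
    if isinstance(tree, str):
        return [(tree, inherited)]
    result = []
    n = len(tree)
    for i, child in enumerate(tree):
        # Branches are numbered from the right: the rightmost branch gets 0.
        result.extend(yngve_depths(child, inherited + (n - 1 - i)))
    return result


def mean_word_depth(tree):
    """Average the word depths over all words in the sentence."""
    depths = [d for _, d in yngve_depths(tree)]
    return sum(depths) / len(depths)


# Hypothetical parse of "the dog barked": [[the dog] [barked]]
sentence = (("the", "dog"), ("barked",))
print(yngve_depths(sentence))
print(mean_word_depth(sentence))
```

Left-branching structures accumulate larger numbers than right-branching 
ones under this scheme, which is why the measure has been taken as an index 
of the load a sentence places on immediate memory. 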

Foppa and Wettler (1967), working with the German language, found 
the predictability of sentences best when the syntax was complicated. 
Martin and Jones (1965) found that highly redundant (i.e., predictable) 
phrases were learned faster than phrases with low redundancy. 

Order of approximation to natural language. An approach to 
controlling the net complexity (both syntactic and semantic) of a 
sentence for experimental purposes was originated by Miller and 
Selfridge (1950). They artificially constructed sequences of words 
with various degrees of statistical approximation to English and showed 
that the higher the degree of approximation, the better remembered 
these sequences were. Various issues raised by this research have 
been investigated by M. Brown (1966), Herrmann (1962), Knox and Wolf 
(1965), Lachman, Dumas, and Guzy (1966), Lachman and Tuttle (1965), 
Lawson (1961), Pike (1969), Richardson and Voss (1960), Sharp (1958), 
Salzinger, Portnoy, and Feldman (1962), and Tejirian (1968). For 
example, Tejirian's results seem to indicate that syntax is the more 
important factor with low orders of approximation, while semantic 
factors are more important with high orders of approximation. Brown's 
and Herrmann's results seem to disagree with respect to the role of 
word frequency and familiarity; in the usual method of constructing 
orders of approximation, word familiarity and the familiarity of 
grammatical sequences both tend to increase with order of approximation 
and thus constitute confounding influences. 
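
The notion of an order of approximation can be made concrete with a small 
sketch. Miller and Selfridge obtained their sequences from human 
respondents who continued short word strings; the code below substitutes a 
different, purely mechanical technique, sampling each word according to the 
words that followed the preceding order-minus-one words in a sample text. 
The function name, the restart rule at dead ends, and the tiny corpus are 
assumptions for illustration only. 

```python
import random
from collections import defaultdict


def approximation(corpus_words, order, length, seed=0):
    """Generate `length` words, each conditioned on the previous order-1 words."""
    if order < 1:
        raise ValueError("order must be at least 1")
    rng = random.Random(seed)
    if order == 1:
        # First order: words drawn independently, in proportion to frequency.
        return [rng.choice(corpus_words) for _ in range(length)]
    k = order - 1
    followers = defaultdict(list)
    for i in range(len(corpus_words) - k):
        followers[tuple(corpus_words[i:i + k])].append(corpus_words[i + k])
    if not followers:
        raise ValueError("corpus too short for this order")
    contexts = list(followers)
    out = list(rng.choice(contexts))
    while len(out) < length:
        options = followers.get(tuple(out[-k:]))
        if not options:
            # Dead end (context seen only at the end of the corpus): restart.
            out.extend(rng.choice(contexts))
            continue
        out.append(rng.choice(options))
    return out[:length]


corpus = "the dog saw the cat and the cat saw the dog".split()
print(" ".join(approximation(corpus, order=2, length=8, seed=1)))
```

Because both word familiarity and the familiarity of word sequences rise 
with the order used, output of this kind shows the same confounding noted 
in the text. 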

Similarity to oral language patterns. Ruddell (1964) obtained 
results showing rather clearly that children's performance in reading 
comprehension is partly a function of the extent to which the syntactic 
patterns in reading material are similar to the patterns in their oral 
speech, even when vocabulary difficulty is controlled. This result 
lends further support to the notion that variations in the comprehension 
of different syntactical phenomena are to be explained in terms of the 
frequency and familiarity of those patterns. 

Ambiguity. Carey, Mehler, and Bever (1970), Chai (1967), Foss, 
Bever, and Silver (1968), MacKay (1966), and MacKay and Bever (1967) 
have studied the role of grammatical ambiguity in sentence comprehension. 
When ambiguous sentences are presented in isolation, comprehension is 
slowed even when the subject is not aware of the ambiguity. On the 
other hand, if syntactic expectations are built up, the ambiguity is 
not perceived and comprehension is not slowed. In normal discourse, 
it is probably the case that grammatical ambiguity has little or no 
influence except in extreme cases where the writer has failed to provide 
sufficient context for disambiguation. This topic deserves further 
study. 

Lexical density. Follettie and Wesemann (1967) and Perfetti (1969) 
have studied the influence of "lexical density" (the ratio of content 
words to total words in a sentence or paragraph) on comprehension and 
recall, with results generally favoring the hypothesis that high 
lexical density makes for more difficulty in comprehension and 
recall. Their results are not completely clear, however, and this 
topic also merits further examination. 
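
The ratio just defined is simple to compute once content words are 
separated from function words. The sketch below treats any word not on a 
short function-word list as a content word; that list and the function name 
are illustrative assumptions, not the classification used by the 
investigators cited. 

```python
import re

# Short illustrative list of function words (assumption, far from complete).
FUNCTION_WORDS = {
    "a", "an", "the", "of", "in", "on", "to", "and", "or", "but", "is",
    "was", "are", "were", "be", "by", "that", "it", "with", "for", "as", "at",
}


def lexical_density(text):
    """Return the ratio of content words to total words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    content = sum(w not in FUNCTION_WORDS for w in words)
    return content / len(words)


print(lexical_density("The dog was in the house"))
print(lexical_density("Dogs chase cats"))
```

The first sentence has two content words in six, the second three in three, 
so the second is the denser of the two by this measure. 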

The role of different types of grammatical units. If lexical 
density is a significant factor, it is implied that content words 
carry more information than function words. Several studies have 
examined the roles of particular types of lexical units. Louthan (1965) 
and Weaver and Bickley (1968) show that nouns, verbs, and adjectives, 
in that order, carry decreasing amounts of information. Other studies 
suggesting that nouns are the ones best remembered are those of 
Anderson and Byers (1968), Martin (1968), Martin, Roberts, and Collins 
(1968), and Martin and Walter (1969). Prentice (1966) found that 
sentences beginning with high response-strength nouns were easier to 
learn than sentences ending with those nouns. But even grammatical 
endings and function words carry information (as one might expect) 
as compared with a situation where they are absent, as Bogartz and 
Arlinsky (1966) demonstrated. 

The role of elementary sentence transformations. There is a large 
literature, reviewed by Bever (1968), on whether sentences appearing 
in certain transformations (passive, negative, question) are harder 
to understand and remember than sentences appearing in the simple 
declarative form. During the early 1960's, psycholinguists were 
exploring the hypothesis of "derivational complexity" whereby it was 
proposed that people understand sentences by "detransforming" them to 
their base structures, and that difficulty in understanding was a 
function of the amount of detransformation involved. Clifton (1965), 
Clifton, Kurcz, and Jenkins (1965), and Clifton and Odom (1966) 
established that perceptions of sentence similarities were those predicted 
by transformational grammar, but those findings were really irrelevant 
to the hypothesis of derivational complexity. Representative studies 
supporting the hypothesis of derivational complexity were those of 
Miller (1962b), Epstein (1967), Gough (1965, 1966), and Halamandaris 
(1968). Schlesinger (1966b) felt that his evidence was equivocal, 
in view of the difficulty of controlling extraneous factors such as 
sentence length. Slobin (1963, 1966) and Turner and Rommetveit (1967) 
are among investigators pointing out that much depends upon the inherent 
semantic properties of the stimuli, e.g., whether the subject and object 
are transposable ("reversible"). Wearing (1970) found no difference 
in retention of active and passive sentences. Wright (1969) noted that 
when a subject is required to answer a question based on a statement 
that has been presented immediately preceding the question, the latency 
of the answer depends on whether the statement and question are in the 
same (active or passive) voice; latency is longer when they are different. 
This result argues against early versions of transformational theory, 
but its interpretation in terms of current grammatical theory is a 
matter too complex for discussion here. One hypothesis concerning the 
relative difficulty of the active and passive voices, proposed by 
Greenough and Semmel (1969) and by Goldman-Eisler and Cohen (1970), 
is that active sentences are easier simply because, being more 
frequent in speech and writing, they are more familiar. 

Evidence that appeared to support the idea of derivational complexity 
in comprehension was provided by Savin and Perchonock (1965), who 
claimed that passives, negatives, and questions took more space in 
memory than simple active sentences. Several later experiments, e.g., 
those of Epstein (1969) and Simison (1969), suggest that Savin and 
Perchonock's results were an artifact resulting from difficulties in 
recall rather than comprehension. 

Subject-object relationships. Much of the evidence on this whole 
matter suggests that through learning and familiarity, people come to 
expect that the first noun-phrase in a sentence will be an active subject, 
and that a later noun-phrase will be the object of an active verb. 
This expectation constitutes a kind of "heuristic" in sentence comprehension 
(Bever, 1968); the passive construction, on the other hand, is a signal 
that this heuristic will not work in a given case, with the result that 
comprehension is somewhat retarded. Evidence that hearers tend to 
seek out these subject-object relationships is provided by Blumenthal 
(1967), Blumenthal and Boakes (1967), Clark (1969), Clark and Begun 
(1968), Huttenlocher, Eisenberg and Strauss (1968), and Huttenlocher 
and Strauss (1968), although it should be cautioned that these writers 
disagree as to the interpretation of their data. On the other hand, 
when people are asked to rate the "importance" of various elements in 
a sentence, they tend to choose the grammatical subject as most important, 
regardless of the construction of the sentence (Johnson, M. G., 1967; 
Segal and Martin, 1966). This appears to support the idea that the 
grammatical subject is regarded as the "topic" and the predicate as 
a "comment." 

Other specific grammatical phenomena. There are a large number 
of studies, concerned with the roles of various specific phenomena in 
grammar, which merit a listing by author: 

(1) Phenomena of negation and veracity: Huttenlocher, Higgins, 
Milligan, and Kauffman (1970); Jones (1966); Wason (1961, 1965). 

(2) Morphology: Bogartz and Arlinsky (1966); Bryk and O'Connell 
(1967); Martin, Davidson, and Williams (1965). 

(3) Mass vs. count nouns: Hatch (1969). 

(4) Verb structure: Fodor, Garrett, and Bever (1968). 

(5) Verb tense and other markers of temporal relations: 
Clark and Clark (1968); Clark and Stafford (1969); Smith and McMahon 
(1970). 

(6) Comparative adjectives: Clark (1969); Clark and Card (1969). 

(7) Connectives and conjunctions: Katz and Brent (1968); 
Robertson (1966, 1970). 

(8) Embeddings of sentences into other sentences: Hamilton and 
Deese (1970); Miller and Isard (1964); Schlesinger (1966b); Van Kekerix 
(1968); Marks (1967). 

(9) Relative clauses: Edwards (1969). 

(10) Nominalizations: Coleman (1964a); Epstein (1967). 

(11) Anaphora and intersentence relations: Bormuth, Manning, 
Carr, and Pearson (1970). (This important study also contains much 
information on school-age children's difficulties with a wide variety 
of grammatical phenomena.) 

Factors of Content, Organization, and Rhetoric in 
Message Comprehension and Learning 

Content factors. Although "content analysis" is a well-recognized 
technique for the analysis of the propaganda value of messages or the 
"themes" inherent in discourses, to my knowledge it has not been applied 
to the analysis of educational materials with respect to their relative 
comprehensibility. A priori, it has been considered that content 
interacts with the hearer/reader's background of knowledge; a piece 
of discourse will be relatively easier for an individual who already 
has some familiarity with the content. Few studies of this assumption 
are to be found, however. Ausubel and Fitzgerald (1962) found that 
general background knowledge in endocrinology facilitated the learning 
and retention of new material in this field, yet a somewhat similar 
study by Ausubel and Youssef (1966) tended to disconfirm the notion 
that previous background helps. Mills (1968), Mills and 
Nicolas-Fanourakis (1966), and Mills and Winocur (1969a) experimented with the 
effects of rated "meaningfulness" of sentences (possibly a function of 
familiarity and background) but preferred to ascribe the effects to 
factors of associative strength (see below). 

There are few studies, also, of exactly what kind of content is 
best learned and remembered. In a previous section we have seen that 
nouns are found to be best remembered, followed by verbs. Gomulicki 
(1956) found that subjects remembered narrative sequences better than 
merely descriptive material; in fact, descriptive material was often 
transformed, in recall, into quasi-narrative form. Subjects evidently 
have a strategy of reading or listening such that they scan for the more 
"important" ideas. And even of these ideas they are more likely to 
remember those parts that are "topic" rather than "comment." R. E. 
Johnson (1970) found that rated "structural importance" of elements 
of a prose passage was related to degree of recall. 



Associations among concepts in a text. It has repeatedly been 
demonstrated that if the words in a text are characterized by having 
many high-strength interassociations, the text is more easily learned 
(Riegel and Feldman, 1967; Sheldon Rosenberg, 1965, 1966a, 1966b, 
1966c, 1967a, 1967b, 1967c, 1967d, 1968a, 1968c, 1968d, 1968e, 1969, 
in press; Van Every and Rosenberg, 1969). 

Correlatively, it has been demonstrated that as an individual 
learns a subject-matter better, he has better-formed associations among 
the concepts (Gardner and Johnson, 1967; P. E. Johnson, 1967a, 1967b, 
1969, and in press; Rothkopf and Thurner, 1970; Caplan, 1968; 
Krueger, 1968). 

The converse of these propositions is that incorrect or inappropriate 
word associations can interfere with comprehension (Hinze, 1961). 

Concreteness and Imagery. Texts that have many words representing 
concrete ideas, as opposed to abstract ideas, are more easily comprehended 
and remembered. Yuille and Paivio (1969) offer evidence that thematic 
storage is in the form of imagery. This is backed up by considerable 
research on the role of imagery in recall (Begg and Paivio, 1969; 
Paivio, 1969; Paivio, Yuille, and Madigan, 1968; Paivio, Yuille, and 
Rogers, 1969; Pompi and Lachman, 1967). 

Yet, Brooks (1965) found that instructions to visualize had little 
effect on ability to recall a text, whereas accompanying the text with 
appropriate pictorial representations facilitated recall. Brooks (1967) 
also claimed that the act of reading suppresses visualization since 
reading and visualization would constitute two conflicting uses of the 
same sensory modality.

Organization of textual materials. Since Briggs (1967) has already
reviewed evidence on the sequencing of instruction, our consideration 
of organizational variables will be restricted to characteristics of 
textual materials and their effect on comprehension and recall. 

Lorge (1960) observed that there is no generally agreed-on
procedure for measuring the organization of prose; he did, however,
propose a method. Beighley (1952, 1954) compared "well organized"
and "poorly organized" speeches and found little effect of organization
on comprehension as measured by a multiple-choice test. Other studies
of the organization of oral materials are by Parker (1962), Darnell
(1963), and Thompson (1967).

Lee (1965) developed a method for generating textual materials 
with various levels of structure or organization; according to him,
"the learning effects of level of structure depend upon whether the
test is for main parts abstraction, within paragraph detail, or rote; 
and on the mode of presentation, and part-whole level used." 

A theoretical analysis of the effects of organizational variables 
was presented by El-Okby (1963). 

Recently, studies of organizational variables have focused on the 
detailed manipulation of logical structure (Dawes, 1966; Tweney and
Ager, 1969). Frase (1969a, 1969b, 1969c) has shown that the relative
emphasis given to concepts and attributes in recall can be manipulated 
by different types of textual organizations. 

Deese and Kaufman (1957) and Epstein (1963) have studied the
effect of organization and structure on the temporal factors in recall.
Epstein showed that structured material is more rapidly acquired in a
forward direction, while unstructured material is more rapidly acquired
in a reverse direction. Frase (1970a) found that inappropriate ordering
of sentences impairs memory for relations among sentences more than it
does memory for facts given by individual sentences.

Rhetorical and stylistic factors. King and Cofer (1960a) explored
the possibility that stories varying in the ratio of adjectives to
verbs (the "adjective-verb quotient") would systematically vary in
ease of learning and recall; there was meager but suggestive evidence 
that low AVQ stories are easier to remember. (This would agree with 
the findings reported earlier that verbs are more likely to be remembered 
than adjectives.) 

Hiller (1968), Hiller, Fisher, and Kaess (1969), and Rosenshine (in
press) have studied the effect of a stylistic variable called "vagueness"
on the effectiveness of teachers' oral expositions. Hiller (1968)
showed that vagueness, indexed by the presence of many words conveying
indefinite quantity, approximations, probability, and the like, is
characteristic of the speech or writing of an individual with low
knowledge of a subject. Hiller showed also that teachers whose
speech is characteristically vague are less effective in promoting
learning in their students when they give 15-minute oral expositions 
on a topic. 

Amplification by expanding wordage might be thought to have 
desirable effects. Serra (1954), reviewing studies by Wilson (1944)
and others, pointed out that amplification does not necessarily produce 
desirable effects; sometimes it produces only confusion. Purpel (1961), 
however, found that amplification was effective when the added material
consisted of concrete examples of the generalizations presented. 




Amplification has some resemblance to the "added parts procedure"
studied by Rothkopf (1968c, 1969b). According to him, "In the added
parts procedure, new material is gradually added to previous studied
portions of a written instructional document until it has been presented 
in its entirety." Rothkopf found this procedure to be more effective 
than "comparable whole or part techniques" and offered conjectures as 
to why it was more efficient. 

Serra (1954) also considered the effect of simplification. She 
felt that simplification, like amplification, could sometimes have 
deleterious effects on comprehension and learning, especially when 
essential ideas or concrete examples were omitted. On the other hand, 
there are situations, as pointed out by Desiderato, Kanner, and 
Runyon (1956) and Rosenshine (in press), when simplification is effective
because it eliminates redundant or unnecessary material. 

Rosenshine (in press) observed that teachers who use sequences of 
oral exposition in which a generalization is presented first, followed 
by an example, and then a restatement of the generalization, were more 
likely to produce knowledge in their students. 

Context factors . It is commonly observed that meaning is better 
conveyed when it is provided with appropriate context. Kaplan (1955)
experimented with the degree to which precise meanings of particular 
words can be determined when increased degrees of context are provided. 
Werner and Kaplan (1950) and Braun-Lamesch (1962) studied the manner
in which children can acquire word-meanings through the use of context. 

Tannenbaum (1955) reviewed a number of experiments showing how a single
"index" or "cue" (such as the name of a prominent person, a particular
headline for a news story, or even the simple word in a dialogue)
can markedly affect the interpretation of text accompanying the cue.

Context effects in the learning of continuous text have been
studied by Bruning (1970), Gagné (1969a), and Gagné and Wiegand (1970).
Bruning found certain kinds of relevant contexts helpful; Gagné's
studies, on the other hand, suggest that contexts such as superordinate
topic sentences have an interfering effect at the time of original
learning, but a facilitating effect at the time of recall. Since it
is difficult to make sense of these apparently conflicting findings,
it is obvious that more study is needed of these matters.

Chapter 5

STIMULUS MODALITY IN LANGUAGE COMPREHENSION 

Language occurs in either spoken or written form. Our concern in
this chapter is with what factors enter into the choice of these two
modalities, either separately or combined, for optimal comprehension 
and learning. We will also have occasion to consider the extent to 
which pictorial and graphical representations, appearing either alone 
or as accompaniments to verbal messages, enhance understanding and 
learning. 

General reviews of the problem

The question of visual vs. auditory presentation of material has
been reviewed a number of times (Day and Beach, 1950; Henneman and Long,
1954; Hartman, 1961; Allison, 1964). All these reviews suggest that
the matter is an extremely complicated one; research seems to present
conflicting evidence on numerous points. Probably the most comprehensive,
and most theoretically-oriented review, is that of Travers (1967), who
draws on a model proposed by Broadbent (1958) to suggest that auditory
and visual modalities constitute separate sensory channels which have
to operate independently, and that either channel can become overloaded
with information. Thus, Travers believes that combined audiovisual
presentations are often less beneficial than presentations through
single channels, because combined presentations require rapid alternations
of attention and may cause overloading of the separate channels. Travers
(1966) conducted a series of studies that in general support this
theoretical position; some of these studies relate to the reception of verbal
messages. Travers' position, incidentally, is diametrically opposed to
the position reached in Day and Beach's review, which claimed that the
studies done up to that time consistently obtained an advantage for
simultaneous audio and print channels over either channel alone.

May's reviews of "enhancements and simplifications" of audiovisual
presentations (May, 1965a) and of word-picture relationships (May, 1965b)
are also relevant to the subject of this chapter. May takes no definite 
position on the question of whether combined audiovisual presentations 
are superior to presentations through a single channel, but points 
emphatically to the need for detailed research. 

Before considering comparisons and combinations of channels, we 
shall take up studies on single channels. 

Audition (Listening) as a Channel for
Language Comprehension and Learning

Reviews and bibliographies 

It is only in the last 15 or 20 years that educators have devoted
much attention to listening. Bibliographies and reviews of research are
by this time quite extensive (Keller, 1960; Duker, 1964, 1968, 1969;
Devine, 1967; Wilkinson, 1970). One has the impression, however, that
research in listening has not been sufficiently penetrating and analytical.
Much of the research seems to have been intended to establish listening
ability as a valid objective for the educational program, without
determining its nature and parameters in a precise manner.

Theory of listening behavior 

It cannot be said that there exists any comprehensive theory of 
listening behavior in relation to language behavior in general or to 
other modes of language reception. Zelko (1954) contributed a
semi-popular outline of aspects of listening. Bakan (1956) questioned some
of the assumptions that seemed to be prevalent among teachers of
listening: that listening is a unitary skill, that uniform training
in listening should be given to all students, that listening skill is
teachable, that listening skill is relatively independent of other 
psychological variables, and that the effectiveness of training in 
listening can be evaluated by means of a test of listening at the end 
of the training period. 

Listening should be viewed merely as one modality of language 
reception, affected by all or nearly all the variables that are germane 
to the other principal mode of language reception, reading. Thus,
comprehension by listening is affected by the nature and source of the 
message, the conditions under which it is presented, and the character- 
istics of the listener. 

Studies of listening behavior

The literature search conducted for the present monograph failed to
turn up studies that delineate the parameters of listening behavior
(apart from studies of speech intelligibility, which are not considered
in this review). Most studies of listening are concerned with comparisons
with reading, discussed below, or with measurements of individual 
differences, treated in Chapter 8. However, Foulke and Sticht (1969) 
have reviewed a number of studies which focus on listening. 

O'Neill (1954) found that many people can make appreciable use of 
visual cues (by watching lips, presumably) to gain information from 
speakers, particularly in the presence of interfering noise.




Vision (Reading) as a Channel for
Language Comprehension and Learning 

The large amount of research on reading, summarized in reviews such
as those of Anderson and Dearborn (1952) or Williams (1965), has been
concerned mainly with the teaching of the elementary skill of "decoding"
print into an analogue of speech, or with accelerating reading rate and
similar matters. There has been much less attention paid to the general
problem of comprehending language through reading, and to the different
kinds of purposes for which reading is done (Hall, 1969). Reading
comprehension as a topic in itself has been treated by only a few
writers, e.g., Kingston (1961), Mehler (1968b), Pickford (1933), Piekarz
(1956), Ryan and Semmel (1969), Schoeller (1950), Wiener and Cromer
(1967).

Studies of reading comprehension 

One of the first to study processes of reading comprehension was 
Thorndike (1917a, 1917b, 1917c), who pointed out that reading is
essentially a reasoning process and therefore considered mistakes in
reading as being largely errors in thinking. Touton and Berry (1931)
analyzed 20,003 errors in comprehension made by college entrants and
found that most of them related to inability to understand the details of
questions, or to isolate or relate specific elements in the material.
Gray (1951) attributed difficulty in reading comprehension to the nature
and difficulty of the concepts involved, the way in which they were
expressed, or inherent limitations of the reader.

Goodman (1969) and Goodman and Burke (1969) have made refined
classifications of oral reading "miscues," i.e., errors in producing spoken
responses to match the text, among children in grades 2, 4, and 6. While many
of these miscues are due to failure to recognize words, the majority of them
appear to arise from the building up of incorrect expectations about the text.
Goodman thinks of reading at this level as a "psycholinguistic guessing
game" in which the reader attempts to guess what the text is saying,
often by inferring deep structure and producing a surface structure with
an incorrect transformation. As Ruddell (1965) has shown, the child
is more successful when the language of the text corresponds to his 
oral language habits. 

Studies of reading comprehension processes among high-school-age 
children are those of Bell (1942) and Jenkinson (1957). At this level,
few errors are due to faulty word recognition; some can be attributed to
faulty habits whereby the child does not adequately attend to details in
the text. Of course, some difficulties stem from inadequate vocabulary
knowledge, but most errors are due to faulty thinking and reasoning 
about the message. Jenkinson provided a detailed classification of the 
errors children made in attempting to perform the cloze task on a 
variety of types of literature. Subjects exhibited not only problems 
in comprehension of materials but also in making appropriate inferences 
from these materials. 

Other useful studies of processes of reading comprehension are 
those by Bormuth (1970"^), Fagan (1969), Macnamara, Feltin, Hew, and
Klein (1968), Pickford (1933, 1935), and Swain (1953).

Reading rate. There are few good studies, surprisingly enough,
on the parameters of reading rate in relation to difficulty of material,
educational level of the reader, and the purpose of reading. Broad
generalizations such as the statement that the average college student
reads at 275 words per minute have little meaning. One study that begins
to provide adequate parametric information is that by Kershner (1964).
Kershner measured reading rates of the adult population by a door-to-
door survey, using materials of different levels of difficulty and
investigating the effect of requiring the reader to answer questions
based on the material.

The possibility that some individuals attain, or can be taught to
attain, very high reading rates while preserving comprehension is a
highly controversial question. Berger (1968a, 1968b) and Hultgren
(1968), after reviewing the evidence, are rather skeptical that
abnormally high reading rates can be attained without loss of
comprehension. The physiological limit for reading speed "taking in every
word" is estimated to be about 800 words per minute. Nevertheless,
Schale (1970) renders a preliminary report about two very "gifted"
readers who appear to have broken through the physiological limit. 

Subvocalization. Edfeldt (1960) reported that degree of
subvocalization during reading, as indexed by electromyographic recordings, is
related to the difficulty of the material being read. McGuigan, Keller,
and Stanton (1964) reported a variety of covert language responses
during silent reading but did not relate these either to comprehension
or to difficulty of material. On the assumption that subvocalization
tends to retard reading speed, Hardyck, Petrinovich, and Ellsworth (1966)
developed a conditioning technique whereby such subvocalization could be
inhibited. The relevance of subvocalization to reading comprehension has
yet to be elucidated. 



Eye-voice span. If during oral reading of a passage a reader is
suddenly prevented from viewing the material, the number of words he can
report ahead of where his view was blocked is a measure of eye-voice
span. Several investigators (Lawson, 1961; Levin and Cohn, 1967;
Levin and Jones, 1967; Levin and Kaplan, 1966; Levin and Turner, 1966;
Wanat and Levin, 1967) have used this technique to investigate the role
of various message factors, principally grammatical structure, in
reading. Resnick (1970) concluded on the basis of her experiment that
syntactic competence is learned independently of perceptual control,
but that the latter is necessary for the former. Mehler, Bever, and
Carey (1967) concluded from studies of eye-movements that adults acquire
the habit of fixating on the first half of phrase structure constituents. 

Listening vs. Reading 

For a long time, educational psychologists have been trying to
answer the question: do people learn best by hearing spoken discourse,
by reading printed discourse, or by having some kind of combined
experience with hearing and reading?

This is a difficult question to answer even if we exclude problems 
of the reception of the signal, or of its perception. It is most
important to control the time taken for the presentation; the reading
and listening abilities of the subjects are also important factors. The
method of measuring comprehension and/or recall may give different answers
(King, 1968c). In what follows, we summarize the existing knowledge,
but it must be recognized that this knowledge is far from definitive.

At the elementary school level, material is usually found to be
comprehended and learned better through listening (Carver, 1934, 1941;
Caughran, 1953), but W. H. King's (1959) results are not clear on this point.
These findings probably reflect the immature reading skills of elementary
school pupils. At the high school level and above, however, research
results usually favor reading over listening (Beighley, 1952; Carver,
1934, 1941; Caughran, 1953; Cody, 1962; Henneman, 1952; Webb and Wallon,
1956). Corey (1934), comparing learning from lectures with learning
from readings, found the latter more effective in terms of immediate
recall, but the difference disappeared with time. 

In the above studies, little attention was paid to relative 
presentation times. Webb and Wallon noted that since the time necessary
for the read-through of printed material was shorter than that necessary
for its oral presentation, reading is a more efficient manner of learning
from continuous discourse than hearing it. Webb and Wallon also
established that if the time of exposure was held constant, i.e., when 
readers were allowed to see the material the same amount of time as 
hearers listened to the oral presentation, they made a significant gain 
in comprehension. 

The superiority of reading print over speech is partly a function of 
how fast an individual can read. In Chapter 6, we will consider the 
possibility that more efficient learning from spoken discourse might
be obtained if the speech were somehow speeded up. 

Probably the best evidence on reading vs. listening available at
the present time is that presented by King (1968c) and King and Madill
(1968), who used both visual and aural presentations of stories of
several lengths, and both oral and written recalls. The recalls were
scored in a number of ways to reveal scores on the two factors of "details"
and "gist" that King had previously discovered as important and (relatively)
independent dimensions of such recalls. In terms of memory for detailed
factual information, visual and auditory modes of presentation are about
equal. For "gist" and organized response, visual presentation is
superior because subjects have more opportunity (even in equal time with
oral presentation) to organize the material cognitively. These results,
incidentally, were obtained with college-age subjects. 

Little research (except that of Carver, 1941; and Beighley, 1952)
has investigated the role of the difficulty (readability, listenability)
of the material. Carver's research suggested that the advantage of
visual over auditory presentation increases with the difficulty of the
material. Beighley's results were equivocal on this point.

However, research with nonprose verbal materials supports the idea
that visual presentation is increasingly advantageous for more difficult
material. Both Schulz and Kasschau (1966) and van Mondfrans and Travers
(1964) found that auditory presentation is significantly inferior for
materials of high difficulty or low "meaningfulness" such as nonsense 
syllables or rare words. 

Kay (1958) produced evidence that there are individual differences
in preference for sensory channel, most people preferring visual 
presentation for learning word pairs, but a few extreme cases favoring 
auditory presentation. We do not know whether such preferences also 
apply to prose materials.

Simultaneous Listening and Reading



For elementary school children, research is available to indicate 
that, for example, it is advantageous to read aloud test instructions 
while the child reads along with this presentation. Undoubtedly this 
is true because of the immature reading skills of many children. 

At more advanced educational levels, however, combined auditory- 
visual presentation of connected prose either shows no advantage over 
visual presentation (auditory presentation being inferior at this level
in any case) or actually constitutes an interference (Mowbray, 1953),
particularly if the materials are easy. This is probably because oral 
presentation tends to be much slower than what is possible in silent 
reading, and hence the two presentations are, so to speak, out of phase. 

Pictorial and Graphic Accompaniments 
of Verbal Messages 



Many aspects of the problem of pictorial enhancements of verbal
messages have already been treated by May (1965a, 1965b). Pictures
may be of many kinds — schematics, line drawings, up to colored photographs; 
still or animated. An educational taxonomy of pictures has been 
proposed by Fleming (1967). The modern film has developed a language
of its own; Forsdale and Forsdale ( 198 * 0 ) point out how foreign a film 
representation must seem to preliterate peoples. Jacob (1969), however,
claimed on the basis of a research study that the normal child of 11 
has mastered cinematographic language "in its entirety." 

Words vs. pictures. The research background for this section must
be drawn primarily from studies that have involved, not continuous prose,
but single words in conjunction with pictorial representations of those
words. Research findings exhibit many inconsistencies that can probably
be resolved only with the discovery and testing of the critical variables.

Bourisseau, Davis, and Yamamoto (1965) found that printed words 
produce more free associations that have "sense-impression" implications
than pictures of the corresponding objects. Nevertheless, the proportion
of such associations was relatively small. But since pictures are not
thought of as useful mainly for producing "sense-impression" free
associations, this research of Bourisseau et al. seems of little
relevance.

Most researchers find that ideas represented pictorially are more
easily learned than ideas represented by single words (Jenkins, Neale,
and Deno, 1967; Lieberman and Culpepper, 1965). Rohwer, Lynch, Suzuki,
and Levin (1967) found that memory for paired-associates was enhanced 
when pictures of them showed action (as opposed to still pictures). 

Hartman's study of memory for associations between names (printed, 
spoken) and faces showed no particular advantage for adding the visual 
dimension, but his experiment has little bearing on the problem because 
the learning of faces is itself a difficult task (faces probably being 
much less discriminable than names in either visual or auditory form).

The statement that adults generally have preferences for visual 
information is supported by Lordahl's (1961) finding that in a concept
discrimination task, subjects were more likely to attend to visual than
to auditory stimuli. Stevenson and Siegel (1969) found that as children
get older, they pay increasing attention to visual information in film
presentations, and less attention to the auditory information.

In view of the above research, one might expect to find that
pictures do indeed enhance learning when they accompany verbal
presentations.

Pictures accompanying connected discourse. The evidence for
enhancement from pictures accompanying connected discourse is very meager
and certainly inconclusive. Some positive evidence was obtained by
Halbert (1943) and Strang (1941), but negative evidence is afforded by
the studies of Stutz (1945) and Dwyer (1967), for example. Dwyer
found, however, an advantage of abstract, schematic line drawings in
the teaching of anatomy, whereas realistic pictures were no better than
strictly verbal presentations. Koenke (1968) found that pictures do
not help elementary school children derive the main ideas from paragraphs,
and W. A. Miller (1938) found that children's understanding of elementary
reading material was the same regardless of whether the material had
accompanying pictures. Parsons and Frase (1968) reported that college
students learn electrical circuitry principles just as well from verbal
presentations as they do from graphic presentations. M. D. Vernon (1946)
pointed out that students usually do not learn much from graphs. Two
studies supporting the advantages of pictorial presentations were those
of Williams (1961), who found that students got higher scores on verbal-
pictorial tests than on purely verbal tests, and Fredrick (1969), who
found students learned grammatical principles better from symbolic
representations (tree diagrams of syntactical representations) than from
verbal statements.

To conclude, pictures sometimes help the conveying of information, 
but generally they do not. Research is needed to determine what kinds 
of pictorial presentations enhance the transmission of information, and
under what circumstances. Possibly the critical variable is the method
of measuring learning. Surely some pictures convey certain types of 
information more efficiently than verbal statements, but it is difficult 
to test the acquisition of this information by purely verbal tests. 

Comparisons of Teaching Methods 
Employing Different Combinations of Audiovisual Techniques 

The finding of "no significant difference" between contrasting 
modes of audiovisual teaching is typical of a vast amount of research 
conducted in recent years. For example, Dworkin and Holden (1959) 
found no difference in the effectiveness of lectures and filmstrips for 
teaching principles of atomic bonding to graduate engineers. Eyestone 
(1966) found no differences between bulletins, films, and lectures in 
teaching 4-H club information. It seems useless to review this research
in detail not only because significant differences are seldom found but 
also because the results, obtained in situations where it is generally 
impossible to control variables precisely, yield little if any insight
into processes of comprehension and learning from verbal discourse.

Chapter 6



PRESENTATION FACTORS

This chapter directs attention to a number of variables relating 
to how a spoken or printed message is presented to the hearer or reader.
The effect of these variables on either the comprehension or the learning 
of the message is considered. 

The Presentation of Spoken Messages 

The vocal skill of the speaker, and related variables. In the
case of informative speaking, Petrie (1963) regards the evidence on the
effect of the speaker's vocal skill in delivery as inconclusive. Poor
voice quality, nonfluency, and even stuttering do not interfere
significantly with comprehension. Nevertheless, in two separate studies
Beighley (1952, 1954) found that students remembered more when they
heard a speech given by a skilled speaker. In the second study, this
was found to be true both for immediate and delayed (two-week) recall.
The effect was more pronounced for hard as opposed to easy material.
Coats and Smidchens (1966) found that students had better immediate
recall for the contents of a lecture when it was given in a "dynamic"
manner rather than a "static" manner. Likewise, T. D. Skinner (1963)
found better immediate and delayed recall for a television presentation
when given with "good" delivery as opposed to "poor" delivery (an
actor was trained to give both types of delivery). One is inclined to
conclude that manner of delivery does indeed make a difference, but
research has not disclosed any explanation for the phenomenon. Possibly
the effect of good delivery is to arouse greater attention.

Rozran (1968) compared the effects of normal and "list" intonation
on 4th grade children's comprehension of short informational passages.
She found that list intonation appeared to aid the comprehension of
difficult passages but impeded the comprehension of easy passages. 

Little research has been done on the effect of introducing pauses 
at phrase or other boundaries in a speech presentation. Bolinger and 
Gerstman (1957) showed that in the absence of other cues, acoustic 
pauses are capable of inducing a particular structural (grammatical) 
organization in speech perception. 

Dialect . Harms (1961) found that comprehensibility was greatest 
when speaker and listener social status coincided. Weener (1969)
found that children speaking the standard dialect had trouble
understanding a nonstandard dialect, but that children who were speakers of
a nonstandard (Negro) dialect understood the standard and nonstandard
dialect about equally well. Weener's language samples were 1st, 2nd,
and 4th order approximations to English. 

Foreign accent . Black and Tolhurst (1955) investigated the 
intelligibility of English spoken by French and British speakers and 
the effects of dialect familiarity of American listeners. The French
speakers had a reasonably good command of English, but spoke it with an
accent. French, British, and American listeners understood British
speakers better than they did French speakers. After one hour of
familiarization with the foreign dialect, American listeners significantly
improved in their understanding of both French and British speakers.

Thus, it would seem that the understanding of dialects and foreign
accents is largely a matter of familiarity and learning.

Speech rate. Several investigators (Goldstein, 1940; Diehl, White,
and Burk, 1959; E. C. Miller, 1954) have found that over a wide range of
oral speaking rates, e.g., from 100 to 200 words per minute, there is
little effect of rate on comprehension, learning, or the listener's
assessment of the speaker's quality of delivery. With the development
of devices for accelerating speech rate without pitch distortion, it
has become possible to investigate comprehensibility and learnability
of material presented at much faster rates. This literature has been
thoroughly reviewed by Foulke and Sticht (1969). It appears that
intelligibility is maintained with little change, up to about 275 words
per minute, although there is a slow decline in comprehension and
learnability from about 175 wpm up to that rate. Beyond 275 wpm, both
intelligibility and comprehension suffer sharp losses. Foulke and
Sticht speculate that this is because speech processing (registration,
decoding, and storage) takes time and cannot be efficiently performed
at rates above 275 wpm.

Jester (1966; also see Travers, 1966) compared audio, visual, and 
audiovisual channels with respect to the effect of rate changes on 
comprehension, controlling time parameters for all three channels in 
a comparable way. Listening comprehension was found to be slightly 
superior to reading comprehension up to approximately 200 wpm, but 
inferior to reading comprehension thereafter. Mean comprehension 
scores for visual and audiovisual presentations showed a parallel decrease 
between 200 and 350 wpm, but in terms of efficiency (information gained
per unit of time) the decreases were not marked. Simultaneous reading
and listening at 350 wpm resulted in better comprehension than could be
demonstrated with either mode of presentation alone.

In a special issue of the Journal of Communication devoted to
research and theory relating to compressed speech, Barabasz (1968),
Foulke (1968), Friedman and Johnson (1968), Miron and Brown (1968),
Orr (1968), Reid (1968), Sticht (1968), and Woodcock and Clark (1968)
have discussed various issues related to the use of compressed speech
in education. See also studies and articles by Barnard (1970), Hoop
(1966), Eckhardt (1970), Enç (1959), Fairbanks, Guttman, and Miron
(1957a, 1957b, 1957c), Foulke (1967), Foulke, Amster, Nolan, and Bixler
(1962), Friedman and Johnson (1969a, 1969b), Goldhaber (1970), Goldhaber
and Weaver (1968), Gordon, Gordon, and Perrier (1967), Gropper (1969),
Henry (1967), Langford (1968), Lavton (1967), Loper (1967),
Michel-Miller (1970), Orr and Friedman (1967, 1968), Orr, Friedman, and Graae
(1969), Orr, Friedman, and Williams (1965), Robins (1968), Rossiter (1970),
Sticht (1969, 1970), Voor and Miller (1965), and Wood (1966).

A general conclusion seems to be that after an initial period of
adaptation, many students, especially those with above-average verbal
abilities, can profitably learn from materials auditorily presented at
rates up to 275 wpm. Such presentations are, of course, most beneficial
for blind students. Under certain conditions they can profitably be used
also with sighted students, for motivation and variety in the educational
program, or to aid in the acquisition of reading or listening comprehension
skill. Efforts to train people in the comprehension of materials presented
auditorily at rates beyond 275 wpm have thus far been essentially fruitless.
Also, no effective way of improving the comprehensibility of speech
presented at very fast rates has yet been found, but up to 275 wpm, variations
in intelligibility are affected by such factors as speaker characteristics,
method of compression, etc.

Delayed auditory feedback. If a subject is required to read a
passage aloud in such a way that a recording of his rendition is fed
into his ears with a lag of about one-quarter second, pronounced
interference with his speech is produced. This phenomenon is called
delayed auditory feedback (DAF), and has been used in a series of
researches by King and others to investigate the effect of this type of
stress on comprehension, learning, and recall (King, 1963, 1965, 1968a,
1968b, 1969; King and Dodge, 1965; King and Walker, 1965; King and Wolf,
1965; Bernstein, 1962; Harper and King, 1967; Hassig and King, 1968).

King (1969) concluded that DAF apparently influences only the learning
and not the recall processes. Since DAF uniformly retards learning, 
these results have no educational application other than to suggest 
that delayed auditory feedback and similar effects should be avoided. 

Distractions during listening. Broadbent (1952a, 1952b, 1956, 1958),
Peters (1954a, 1954b), and Treisman (1964), among others, have made
extensive investigations of the effect of noise and competing auditory
messages on the comprehension of speech. As the competing messages
become more similar to the target message, the interference becomes more
pronounced. However, Henneman (1952) found that, because of the
characteristics of the auditory channel, it was superior
to the visual channel when the subject was required to pay attention to
simultaneous messages (e.g., one auditory, one visual) or to perform
visual or manual tasks.

Festinger and Maccoby (1964) found that visually distracting stimuli
(e.g., films) tend to make people less resistant to auditorily-presented
persuasive propaganda that conflicts with their opinions.

The Presentation of Written Messages

Format variables. Research on format variables has had two types of
objectives: (1) to investigate psycholinguistic processes, and (2) to
test possible methods of improving the presentation of material for
informative and educational purposes.

Representative of the first type are studies by Graf and Torrey
(1966), Epstein (1967), Anglin and Miller (1968), and Bryk and O'Connell
(1967). Graf and Torrey, and Anglin and Miller found that prose
material was more easily comprehended or memorized when it was presented
in physically separated grammatical units of phrase structure than when
the segments were presented in irregular relation to phrase structure.
They used this evidence to argue for the "psychological reality" of
phrase structure. However, Epstein found that "chunking" the material
into phrases by typographical devices did not facilitate learning in the
expected way.

Representative of the second type are studies by Hites (1954),
Klare, Mabry, and Gustafson (1955b), Klare, Shuford, and Nichols (1958),
Hershberger (1964), Hershberger and Terry (1965), and Carver (1970c).
Hites found that paragraphing, but not the use of subject headings, was
effective in written presentations. Carver's study failed to find any
significant usefulness for typographical devices to separate "chunks" or
phrase groups in increasing reading speed and comprehension. (Cf.
Epstein's result reported above.) The remainder of these studies suggest
that only a limited form of typographical "highlighting" of important
points (e.g., by underlining or italicizing) is effective in promoting
comprehension. More complex types of highlighting (e.g., combined use
of full caps vs. lower case, different colored inks, and underlining) serve
only to confuse and distract the reader. These researches, however, have
not investigated all the possible types of typographical cueing; for
example, it would be interesting to study the effect of outlining formats.
It may be that training in the use of these formats would be necessary
to make them effective.

Rate. Since reading rate is ordinarily under the control of the
reader, there has been little research on the effect of controlling
reading rate except in the context of training programs for increasing
reading rate. In films, it would seem that rates of presentation of printed
material vary widely. Reid and MacLennan's (1967) summary of instructional
television and film research contains no reference to research on rate of
presentation of printed material that would be appropriate for various
audiences.

Gilbert (1959) collected useful data on the speed of processing 
visual stimuli and its relation to reading. Orr (1964) speculated that
maximum speeds of listening (to compressed speech) and reading are an 
index to the speed of "thought." 

Distracting stimuli. Reference has already been made to the work
of Henneman (1952), which found that requirements to perform visual or
manual tasks simultaneously with reading generally interfere with reading
comprehension.

Freeburne and Fleischer (1952) showed that presentation of various
types of music to groups while reading had no significant effects on
comprehension. McGuigan and Rodier (1968) observed that presentation of
auditory language stimuli to a subject who is reading produces a greater
amount of covert oral behavior, but that white noise does not have this
effect.

Chapter 7

VARIABLES IN LEARNING FROM VERBAL DISCOURSE 

The previous three chapters have been concerned with factors that
could apply equally well to comprehension of verbal discourse and to
learning from such discourse. Many of the studies of these factors
involved learning simply because learning measures were the most
convenient indices of comprehension; indeed, in most cases they were the
only available indices of comprehension. In the present chapter we shall
consider variables that apply only to situations which demand that some
form of learning from verbal discourse be demonstrated.

Despite the fact that the phenomenon of learning is often considered 
to be the unique domain of psychology, and despite the long history of 
psychology's interest in learning, the field still resists satisfactory 
conceptual organization. This is especially true in the case of learning
from verbal discourse, because the traditional categories of learning
theory (different types of conditioning, the laws of association, various
experimental paradigms) do not seem to be readily applicable. Either
the phenomenon of learning from verbal discourse must be regarded as
constituting a new and unique paradigm in itself, or it may serve as
the basis for achieving a rapprochement among the disparate theories and
paradigms of human learning. We would like to believe that the latter is
the case, but the fulfilling of any such promise probably lies far in
the future.

It would not be easy, for example, to fit meaningful verbal learning
within the framework outlined by Gagne (1970). Gagne suggests that all
learning can be classified into eight types: signal learning,
stimulus-response learning, chaining, verbal association, discrimination learning,
concept learning, rule learning, and problem solving. Learning from
verbal discourse might be all of these, or none of these. Treatments
of "human learning" (e.g., Hovland, 1951; Hall, 1966; Kausler, 1966;
Underwood, 1964) are of little help. Useful analogies
between prose learning and list-learning are difficult to draw, although
some valiant attempts have been made (Goss, in press; Musgrave and Cohen,
in press). There remains a considerable gulf between "verbal learning"
research and the analysis of learning from meaningful discourse;
nevertheless, it may be helpful in this chapter to utilize some of the
kinds of variables traditionally considered in research on human learning,
such as frequency and repetition.

The point of view that we would like to espouse here is an 
"information-processing" view. It is close to the position taken by 
Ausubel (1968), who believes that meaningful verbal learning involves 
two processes: perception and cognition. According to him, "perception
involves an immediate content of awareness before the intervention of ...
complex cognitive processes," while "cognition involves such processes
as relating the new material to relevant aspects of existing cognitive
structure..." (Ausubel, 1968, p. 56).

In the organization of this chapter, we will consider the process
of learning from meaningful verbal discourse as a series of events,
roughly classifiable into three categories: (a) prelearning events,
such as the past learning history of the individual, or events immediately
preceding the learning situation, such as the instructions given to an
experimental subject or the sets or strategies that the learner brings
to the learning task; (b) events during the learning process itself, i.e.,
during the presentation of the stimulus; and (c) subsequent events, such
as the cognitive organization or reorganization of the stimulus material
as it is stored in memory or retrieved for recognition or recall.

Prelearning Variables

Meaningful vs. rote learning. For the sake of completeness, and
also to clear out some underbrush, we should first mention the question
of "meaningful" vs. "rote" (verbatim) learning. It is rare in education,
although occasionally justified, that the student is required to learn
material verbatim. The more important kind of learning is for the
substance or meaning of discourse. Yet a large amount of psychological
research in verbal learning, even now, is concerned with the learning
of the exact words of a sentence or passage that is presented.
Psychologists have long been aware of the difference between meaningful
and rote learning (Welborn and English, 1937); their preference for
working with the latter has been dictated, for the most part, by the
fact that verbatim recalls are much easier to score and quantify.

We call meaningful vs. rote learning a prelearning variable because
it is possible to instruct subjects in advance to learn either to
retain ideas or to retain the exact words. Cofer (1941) did this and
showed that these processes had somewhat different properties: verbatim
learning takes more time than meaningful learning ("logical" learning,
Cofer called it); time required for verbatim learning increases much
more rapidly, as the length or quantity of prose material increases,
than is the case for meaningful learning; and there is faster forgetting
for verbatim learning. These findings accord generally with those of
English, Welborn, and Killian (1934), who invented an ingenious method
of measuring both rote learning and logical learning within the same
subjects and within the same learning trials. In this latter experiment
it may be presumed that some subjects were operating with a set to learn
ideas while others were operating with a set to learn words more or less
by rote. Such sets could arise from various sources: a history of
success in rote learning, a strategy adopted because of the attitude
that rote learning is beneficial, and so forth. Welborn and English
(1937) point out that success in meaningful learning is much more related
to intelligence (thus, to general verbal ability) than success in rote
learning. It is possible that learning for ideas is a strategy much
more likely to be adopted by students of above-average verbal ability,
while a strategy of learning for rote recall is one more often adopted
by students of lower verbal ability.

Since the time of Cofer's classic experiment on logical vs. rote learning,
experimenters have paid little attention to this variable except insofar as
their experimental designs may be such as to require rote learning. In
studies of learning from prose that do not require rote learning, it is still
possible that many subjects adopt a strategy that emphasizes rote learning,
i.e., the memorization of sequences of words without understanding their
meaning. Thus the variable of learner strategy has often been left uncontrolled;
possibly this accounts for conflicting results in the literature of learning
from prose. Techniques such as that employed by English, Welborn, and Killian
(1934) could be used to determine the typical strategy of the subjects.

In one of the few recent studies of the effects of differential
learning instructions (King and Russell, 1966), a "rather disturbing
conclusion" was suggested:

"When Ss [undergraduates taking an introductory psychology course]
are instructed to learn connected meaningful material on the
basis of main ideas or essential ideas they tend to recall
proportionately more words, letters, sentences, etc., than ideas
or sequences of words. On the other hand, when instructed to
learn on an exact wording or a word-for-word basis, Ss recall
proportionately fewer words, letters, sentences, etc., and more
ideas. Apparently, Ss rather consistently interpret instructions
in learning connected meaningful material in a manner not in
keeping with the expectations of Es. A great deal of research
is needed on the interpretation of the learning instructions by
Ss and the strategies they adopt to fulfill these instructions"
(King and Russell, 1966, p. 482).

It is possible that these results are in some way an artifact of King 
and Russell's experimental procedures or their methods of scoring 
recalls. Otherwise, one is tempted to recommend more emphasis on rote
memorization in order to promote more meaningful learning! 

[An experiment by Elley (1966) contrasts rote and meaningful
learning, but his definitions of these terms do not correspond to
"rote" and "logical" learning as used here: Elley's tasks did not
involve prose learning.]

Intentional vs. incidental learning. This is a matter of whether
the learner intends to learn, or at any rate, knows that he will be
tested and has some motivation to do well on the test, or, on the
other hand, is exposed to the material under the impression that there is
no need for him to learn (sometimes under an instruction that directs
him to learn or pay attention to some aspect of the material that is
irrelevant to what he will eventually be tested on).

It is a matter of common observation, supported by a vast amount
of earlier research, that learning from prose is better when it is
intentional. Under incidental learning conditions, the learner can
easily read or hear a sample of prose without paying attention to
its meaning.

Epstein (1967) and Epstein and Arlinsky (1965) found that structured
material, i.e., syntactically well-formed sentences, was easier to
learn than nonstructured material only when learning was intentional.

Introductory material and "advance organizers." It is the common
practice of writers and lecturers to begin their presentations with
"introductory remarks" that will help to structure what is to follow;
indeed, this very sentence is an instance of this. To what extent
does this introductory material aid in learning?

A long series of researches on what Ausubel (1960) calls "advance
organizers" is relevant to this question. According to Ausubel, advance
organizers are various kinds of introductory expositions which either
present new, generalized concepts under which further detailed learning
can be subsumed, or draw distinctions that enable the learner to
discriminate the new concepts from those he may have established in
his previous knowledge. Experimental studies by Ausubel (1960),
Ausubel and Fitzgerald (1961a, 1962), Ausubel and Youssef (1963, 1966),
Scandura and Wells (1967), Grotelueschen and Sjogren (1968), Proger,
Taylor, Mann, Coulson, and Bayuk (1969), and Allen (1970) have generally
confirmed these notions. However, the usefulness of advance organizers
seems to interact with the degree of previous knowledge of the learner
or with his level of verbal ability in complex ways. Furthermore,
Bauman and Glass (1969) obtained results suggesting that "organizer
material" may be more useful when presented after learning than before it.

Other kinds of prelearning instructions and information. Frase (1969b)
found that a paragraph providing a "conceptual structuring" of subsequent
learning material improved later recall. Similarly, Merrill and Stolurow
(1966) found that presenting Ss with a summary of an imaginary science
prior to learning to solve problems in it did not take increased time
but increased the number of correct responses during the learning session
and on the test. Christensen and Stordahl (1955) failed to find any
effect of organizational aids (summaries, outlines) presented prior
to (or within) reading passages, but it is possible that the motivation
and attention of their subjects (Air Force recruits) was poor.

Tannenbaum (1955) showed that presentation of certain cues in
advance of the reading of a passage had marked effects on the
interpretations that the subjects made of these passages. Brooks (19^5)
found that instructing subjects to visualize a series of spatial
relations described by verbal material ("Try to picture how this scene
would look") had no effect on their learning. Brooks also found that
prior learning experience with visual representations of similar
sentences, or viewing of isolated pictures of the objects in the pictures,
had no effect either.

Advance inhibitors. If advance "organizers" can have a salutary
effect on the learning of meaningful prose, can advance presentation of
dissonant, interfering material inhibit learning? This is the general
question of "proactive inhibition" which has been widely studied in
verbal learning research. Of course, in a very general way, all the
individual's previous language habits are likely to interfere with new
learning, as is shown by the error analysis of recalls (Cofer, 1943) or
in attempted serial reconstructions of approximations to English
(Coleman, 1962b).

Proactive interference in rote prose learning has been demonstrated
by Slamecka (1961) and Mills and Sacks (1967), among others. Ausubel
and Blake (1958) and Entwisle and Huggins (1964) have demonstrated its
operation also in meaningful prose learning, but its effects can be
reduced by the careful drawing of distinctions and contrasts so that
the learner can reconcile the apparent inconsistencies. Ausubel,
Stager, and Gaite (1969) were in fact able to eliminate its effects
entirely, even when the interfering material was overlearned.

Wittrock (1963) found that the learning and retention of differences
were enhanced by the use of explicit directions to notice the differences.

Questions presented prior to learning. It is frequently the case,
in instruction, that teachers or textbook writers pose questions in
advance, so that their students or readers will be alert to find the
answers during the subsequent presentation of learning material. What
effects do these questions have?

Frase (1968d, 1970b) has reviewed a considerable amount of research
on this matter. While pre-questions do have certain positive advantages,
they also have the disadvantage that they cause the learner to focus
attention on certain aspects of the learning material, and to pay less
attention to other aspects that may be equally important. Peeck's
(1970) research confirms this generalization. It is usually better to
insert questions at certain strategic places within the instruction,
or even to present the questions after the instruction (with or without
opportunity for review). This matter will be discussed below. Thus
far, research has not indicated what the effects of questions presented
both before and after instruction will be.

To minimize the disadvantages of pre-questions, it might be thought
that highly general types of questions could be used. Nevertheless,
Frase (1968b) found a result opposite to this prediction.

On the whole, research suggests that the use of orienting questions
be avoided. It is much better to ask the learner to absorb as much as he
can from verbal instruction.

Variables Operating During the Learning Process 

Length-time relationships. There has been insufficient attention to
the parameters of meaningful prose learning with respect to the length of the
material and the time required for learning to different criteria by learners of
different abilities and with different methods. Lyon (1917) provided data
showing that for passages of 1000 words or less (poems), time of learning
(presumably by rote) increases approximately linearly with length. Over the
range 25 to 150 words, Cofer (1941) also found approximately linear
relationships for both verbatim and "logical" (idea) learning, but the slope was much
steeper for verbatim learning. It is interesting to note that the linear
relationships found for meaningful prose, whether by verbatim or logical
methods, are strikingly different from the generally logarithmic length-time
relationships found for nonsense material (Hovland, 1951, pp. 620-622). That
is, additional increments of nonsense material take proportionately more time
to learn as the length of the material increases. Evidently the structured,
grammatical, semantic aspects of prose material do not have this incremental
effect.

King (1970), however, failed to support the total-time hypothesis (that
constant amounts are learned in equal amounts of time) with serial learning of
connected discourse over the range 10 to iiO words in length. Tulving (1967)
suggested that the limit for memory is set by the number of accessible memory
units, but not by the contents of those units. He also noted (1964) that while
intertrial learning may increase logarithmically, intratrial learning is a
different function and may increase linearly. Length-time relationships for
prose learning need much further investigation.

Frequency and repetition. Actually, these are somewhat different concepts
or variables. Frequency is probably best applied to the notion of the
frequency with which words, concepts, ideas, sentence patterns, etc. have
been experienced in the past history of the individual; thus, it corresponds
roughly to familiarity. Underwood (1959) and Underwood and Schulz (1960)
review evidence that "the frequency with which verbal units have been
experienced is the fundamental variable responsible for the characteristics
which have been used to define meaningfulness," and suggest that an
understanding of the role of frequency (in this sense) is important in shaping
the educational endeavor. In the present review we have seen many illustrations
of the importance of frequency.

Repetition, on the other hand, is usually applied to the number of times
that an individual is exposed to a learning experience in either a classroom
or an experimental learning situation. In an experimental situation, it
corresponds roughly to the number of "trials" that are given. A large number
of the learning experiments reviewed here use multiple trials, and it is
practically a universal finding that the more trials there are, the more
learning there is (up to a point of diminishing returns). This is reflected in the
characteristically negatively accelerated learning curve when amount of
learning is plotted against number of trials. Furthermore, retention is a positive
function of amount of repetition. This is so general a finding that it is
hardly necessary to review the evidence for it; it applies to meaningful prose
learning as well as it does to other types of learning. Repetition is almost
always involved in studies of length-time relationships discussed above.

Meaningful prose learning does, however, have some special characteristics
with respect to repeated exposure. Rothkopf (1968a) found that Ss pacing
themselves on repeatedly reading informative passages took less and less time
with successive readings. Clark (1940) found that successive reproductions
of a passage were increasingly accurate, up to a point, even without reexposing
the individual to the original passage. There is an apparent conflict between
this result and that of Howe (1970), who found that with repeated weekly
presentation and recall of meaningful prose the subjects tended to persist
in the errors made early in this series of trials, even though they had
repeated opportunity to correct themselves by inspecting the material. Howe
feels that his results indicate that there should be an emphasis on the
avoidance of errors made early in the learning process.

Reynolds and Glaser (1964) found that various amounts of massed repetition
of program frames concerning technical terminology in biology had little effect 
on learning, particularly as measured in delayed testing. These authors 
recommend that in programmed instruction, repetitions and reviews should be 
more widely spaced, since massed repetitions are likely to contribute to 
monotony. The above results were for materials presented visually to the 
subject. Jakobovits (1965) found that under intentional learning instructions, 
successive repetitions of prose presented auditorily gave increasingly higher 
recall scores; under incidental learning instructions, learning was slower and 
reached an optimum between 4 and 8 presentations, then declined. The difficulty 
of the material and the attitude of the learner were also important factors in 
this experiment. 

Other research reports that should be consulted concerning the effects of
repetition, reexposure, and review are those by Ausubel (1966a), Gibson (1965),
Kay (1955), Lachman and Dooling (1967), Rothkopf and Coke (1963, 1966,
1968), Merrill (1965, 1970), and Merrill, Barton, and Wood (1970).

Serial effects and order of presentation. Deese and Kaufman (1957)
showed that for prose materials, in contrast to unorganized lists, items
tend to be emitted in free recall in the order of their presentation.
Using the method of stimulated recall, however, Rothkopf (1962) found no
significant effect of order of presentation.

Tannenbaum (1954) found that a series of news items occurring in a
radio broadcast are recalled in somewhat the same way as unorganized materials,
i.e., with the typical bowed serial position curve in which the last items are
most likely to be recalled, the first items next most likely, and the
middle-position items least likely to be recalled.

Effects of context, organization, and sequencing. A general review of
this subject has already been provided by Briggs (1967). Mandler's (1967b)
review of the effects of organization on memory is also of some use, although
it pertains largely to memory for materials other than prose.

We have already reviewed a number of studies (Merrill and Stolurow, 1966;
Christensen and Stordahl, 1955) that yielded somewhat conflicting evidence
regarding the usefulness of outlines, summaries and similar organizational
cues within a lesson. Northrop's (1952) study of the effectiveness of
organizational outlines in films suggested that such outlines are useful for
"factual" films, but possibly inhibitory for "ideational" films. All these
studies, however, pertain to specific information or commentary about the
organization of the material, rather than the actual organization of the
learning material itself. In general, the research evidence suggests that
the organization of the learning material often has considerable effect on
learning. For example, Eustace (1969) used "learning set analysis" to
organize a program for the teaching of a complex concept (that of "noun") to
2nd and 3rd graders and found that a well-organized program was significantly
more effective than one that was not organized according to "learning set
analysis."

Gagne (1969a) and Gagne and Wiegand (1970) have studied the effects of
putting several kinds of context sentences immediately preceding facts to be
remembered. These sentences were "superordinate" (like topic sentences),
"coordinate" (conveying a related fact), and "unrelated"; in addition there
was an "isolation" condition in which the facts were not accompanied by
any kind of context. It was found, first, that having no context whatever
promoted most recall, followed by superordinate, coordinate, and unrelated
contexts in that order. There were no effects, however, for recognition tests.
In the second of these experiments it was found that the effect of the
superordinate context was enhanced if it also preceded the recall test question.

Bruning (1970) showed that facts could be better retained, in relatively
short-term memory at least, when they were presented in relevant contexts,
i.e., with other facts about the same general subject matter. However, the
order or organization of the various facts made no significant difference;
they could be presented in random order as long as they were on the same general
topic. Bruning considered that his findings raise a number of questions about
the validity of Ausubel's notions about "organizer" concepts. Apparently the
only "organizer" effect found relevant in Bruning's study was the topic itself,
which was constant in his relevant contexts but highly varied in the
irrelevant contexts.

Questions and other "mathemagenic" activities during learning. There is
a large research literature, already well reviewed by May (1966), Frase (1968d,
1970b), Anderson (1970), and Rothkopf (1970), concerning the role of various
activities that the student can engage in, or be caused to engage in, during
learning. Rothkopf (1965a) dubbed these activities "mathemagenic," i.e.,
"giving rise to learning" (from its Greek etymology). The assumption that
underlies this work is that learning is strongly facilitated when the learner
is somehow required to search his short-term memory for the answer to some
question or problem; this process of searching, it would seem, helps to place
the item to be remembered in long-term memory store. The basic idea is not new;
it was implicit, for example, in the 1917 research of Gates (see Hovland, 1951,
p. 642) that showed that "recitation" (attempts to make active recalls) is far
superior to passive review or exposure, and that the student can sometimes
profitably spend up to 80% of his time in recitation of this sort. (Gates
found that recitation is not as profitable for prose as it is for nonsense
material, but it is still useful for prose.) The idea is also implicit in the
common observation that one learns a subject best when he tries to teach or
write about it.

Only in recent years have educational psychologists seriously turned their
attention to research on utilizing this idea in instructional materials and
procedures. Instructional materials do not ordinarily make good use of the
principle. For example, a book or film usually contains no stimuli that force
a student to engage in mathemagenic activities. If he does so at all, it is
because, perhaps, he has learned this strategy, or is forced to do so as a
result of external circumstances. The promotion of mathemagenic activities on
the part of the student should be considered one of the teacher's most important
functions.

Rothkopf's (1970) concept of mathemagenic activities is so broad as
to include orientation ("getting into the vicinity of instructional objects . . .")
and object acquisition ("selecting and procuring appropriate instructional
objects"), but probably the most important class of such activities is what
he calls "Class III: Translation and Processing." These include "scanning
and systematic eye fixations on the instructional object; translation into
internal speech or internal representations, the mental accompaniments of
reading; discrimination, segmentation, processing, etc." Translation,
segmenting, and processing are stages of progressively greater depth and
inaccessibility to external observation, but all three have memorial
consequences that "become more complex and enduring as the depth of the actions
increases." These Class III mathemagenic activities can be prompted and
facilitated in many ways:

(1) Interspersed questions. The effects of appropriately inserted
questions have been extensively investigated (Frase, 1968d; Kantor, 1960;
Kurtz, Walter, and Brenner, 1950; Hershberger and Terry, 1964; Pyper, 1969;
Rothkopf and Bisbicos, 1967). In general, it is found that questions are better
placed after the material to which they refer, but this is not always the case
(Morasky, 1969; Morasky and Willcox, 1970). The interpretation of the question
effect is still unclear; on the one hand, questions may have an arousal effect
that influences and improves future learning (Natkin and Stabler, 1969), but
they also may have the "backward" effect of maintaining existing reading
behaviors (Frase, 1968a; Watts and Anderson, 1970). Different types of questions
can have different effects: "high level" analysis and evaluation questions
seem to prompt more thorough study and cognitive reorganization, while factual
questions influence only attention to facts (Hunkins, 1968). Entwisle, Huggins,
and Phelps (1968) stress that questions are useful only when the student is well
prepared to answer them. Rothkopf and Bloom (1970) found that the effectiveness
of adjunct questions was increased if they were delivered by a teacher
rather than by a programmed text, but Thomas (1966) found no such effect.

More research is needed to determine exactly how adjunct questions have
their effect; an interesting speculation that may be offered here is that
questions are most effective when they not only cause memory search but
also cause some sort of reorganization of memory traces and associations. A
better theory concerning the effects of questions would make possible the
development of a science of question writing. Bormuth's (1970b) essay on
achievement testing is a step in the right direction, based as it is on
psycholinguistic theory, but it probably fails to take adequate account of
the mental processes involved in memory storage and retrieval.

(2) Constructed responses. In programmed instruction, it has been
the practice to advocate, following Skinner (1954), provision whereby the
student could fill in completions to sentences. Research, however, has shown
that requiring the student to fill in a blank is often not necessary, and
is even time-consuming. For a time it was believed that "covert responding"
was more effective and efficient. It is now believed (Anderson, 1970) that
the critical variable has to do with whether the program frame requires the
student to perform some kind of memory search or cognitive reorganization.
Thus, the research on overt vs. covert responding was often ambiguous because
it did not consider the kind of cueing received by the student.

Some of the pertinent literature on this problem is by Hartman, Morrison
and Carlson (1963), Ashbaugh (1964), Goldbeck and Campbell (1962), Cartier
(1963a), Coulson and Silberman (1960), Crist (1966), Krumboltz and Weisman
(1962), and Williams (1966).

(3) Statements of instructional objectives interspersed in materials:
Games, Johnson and Klare (1967).

(4) Spoken responses: Keislar and Stern (1969b).

(5) Reading under cloze procedure conditions: Anderson, Goldberg and
Hidde (1970). Louthan (1965) found that this procedure was most effective
when determiners were omitted from the text.

(6) Avoidance of "strong prompts": Anderson and Faust (1967) found
that programmed instruction frames that made copying or identification
of correct answers easy were decidedly less effective than frames avoiding such
practices.

(7) Imagery: Anderson and Hidde (1970) found that asking the subject
to visualize or image a situation described by a sentence was an effective
way of forcing him to process the sentence meaningfully.

(8) Carrying out a physical response: Asher's procedure for requiring
the learner to carry out a physical response corresponding to the meaning
of a foreign language sentence (Asher, 1966) may perhaps be regarded as a
variety of mathemagenic technique.

(9) Guessing and searching for answers: Berlyne (1966) claims, on the
basis of his study, that forcing students to guess and then search for the
correct answer arouses their curiosity; it may also be regarded as a
mathemagenic activity.
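
To make the cloze procedure listed above concrete: a cloze text is prepared
by deleting selected words and leaving blanks for the reader to restore.
The sketch below, in Python, blanks out determiners, the deletion class that
Louthan (1965) found most effective. The word list, the function name, and
the sample sentence are assumptions made for illustration; the studies cited
do not specify their materials here.

```python
import re

# Assumed determiner list (articles and demonstratives) for illustration only.
# Note that "that" can also be a conjunction; this simple sketch does not
# distinguish the two uses.
DETERMINERS = {"a", "an", "the", "this", "that", "these", "those"}


def make_cloze(text, targets=DETERMINERS, blank="____"):
    """Replace every word found in `targets` with a blank.

    Returns the cloze text and the list of deleted words in order,
    which serves as the scoring key.
    """
    deleted = []

    def _blank_if_target(match):
        word = match.group(0)
        if word.lower() in targets:
            deleted.append(word)
            return blank
        return word

    cloze_text = re.sub(r"[A-Za-z']+", _blank_if_target, text)
    return cloze_text, deleted


cloze_text, key = make_cloze("The student read a passage about the war.")
print(cloze_text)  # ____ student read ____ passage about ____ war.
print(key)         # ['The', 'a', 'the']
```

Passing a different `targets` set (for example, every fifth word's class of
interest) would give other deletion schemes with the same scoring key format.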

It should be mentioned that Carver (1970b) has severely criticized
research on mathemagenic effects, on the following grounds: (1) failure
to control the total "running time" for the learning (with vs. without questions);
(2) failure adequately to control subjects' strategies in dealing with texts
and questions; (3) failure to make the research externally valid by making
it more comparable to realistic learning situations, e.g., by allowing Ss
to look back over reading material when confronted with questions; (4) failure
to relate the results to an adequate theory. Carver's criticisms certainly
have some force; many of the points he raises should be made the basis for
further experimental investigations. Some of Frase's and Rothkopf's conclusions
on the use and placement of questions seem particularly suspect if
Carver's criticisms are valid. Nevertheless, it is the judgment of the
present writer that the basic notion of "mathemagenic activity" is a useful
one, and that it will stand up in further critical tests.

Note-taking during audio presentations. Although note-taking is a widespread
practice among students, there is little research that confirms its effective- 
ness in learning. Cody (1962) found that note-taking was better than merely 
listening. Minter, Albert, and Powers (1961) found a positive effect only 
for higher-intelligence and initially-uninterested groups. Ash and Carlton 
(1951) obtained the result that there was most immediate retention in a group 
that did not take notes during a film; a group that took notes during the film 
and reviewed them for 10 minutes afterward retained slightly less, and the 
group that took notes during the film and was tested immediately afterward 
retained the least. However, they pointed out that the note-taking probably 
interfered with learning because the films did not have the pauses and
repetitions that would be necessary for note-taking.

Berliner (1969) compared note-taking during a college lecture with
several procedures inspired by Rothkopf's "mathemagenic" hypothesis and found
it to be less effective than those procedures. Whether Berliner controlled
"running time," as Carver (1970b) would suggest, is not clear from his report.

Most of this research seems to fail to take account of the possibility 
that note-taking is a skill that must be learned to be effective.

Post-Learning Variables

We will now consider a number of "post-learning" variables, such as
reinforcement and feedback of knowledge of results, positive and negative
transfer effects, and the phenomena of retention as measured by recognition
and recall.

It is sometimes difficult to decide whether these effects really occur 
after learning. Some of these effects can clearly take place during learning 
trials or sessions. Logically, however, they can best be regarded as post- 
learning events, if one takes the view that a learning event can occur in a 
small amount of time and that a learning session actually consists of a 
series of such discrete learning events. (We might have considered the use 
of "questions" as a post-learning variable.) 

Reinforcement and knowledge of results. It is outside the scope of this
review to consider the difficult theoretical issues connected with whether
"reinforcement" or "reward" as such has any effect on the kinds of learning
that occur during presentation of meaningful discourse. In the first place,
rewards or reinforcements are not normally forthcoming during such
presentations, unless one regards the acquiring of information as inherently
rewarding (as it may be, under certain conditions and for certain people [Jones,
Wilkinson, and Braden, 1961; Rosen, Siegelman, and Teeter, 1963]). It is
only through various external arrangements (e.g., teachers, use of programmed
instruction formats, insertion of questions with answers) that any kinds of
rewards or reinforcements accrue to the receiver of written or spoken
instruction. Research on the role of reward and reinforcement has necessarily
been limited to the study of the effects of such external arrangements. In
the second place, it is extremely difficult (some believe it is impossible)
to separate the effects of "reward" or "reinforcement" as such, on the one
hand, and of "knowledge of results," on the other. There is an extensive
literature on these questions, accessible through standard references on
learning and learning theory. Our consideration of these issues will be
limited to results obtained in the context of meaningful prose learning,
usually in settings such as programmed instruction.

Delay of feedback. According to conventional learning theory, feedback
and reward are most effective when given as soon as practicable after a
response. Evidence is accumulating, however, that this may not be the case
with meaningful prose learning. The responses, in this case, are the
answers given by students to questions in tests of retention. What is not
clear is whether the student should be informed of the correctness of his
answers immediately after taking the test (as would be suggested by conventional
learning theory), or after some delay. English and Kinzer (1966) and More
(1969) have obtained experimental results that indicate that the feedback
of information should be delayed to some extent; English and Kinzer found
1-hour and 2-day delays superior to immediate feedback, on the one hand, and
also to 1-week delay, on the other. More found optimal delays at 2 1/2 hours
and 1-day, as opposed to immediate feedback and 4-day delay. It is difficult
to incorporate these results into existing theory, and they may lack external
validity in view of the fact that multiple feedbacks, at several intervals
of time, might be even more effective. Sturges (1969) inferred from
experimental results that feedback should include information concerning incorrect
alternatives on a multiple-choice test, but Phye and Baller (1970) were unable
to replicate this finding.

Other aspects of reinforcement. Krumboltz and Kiesler (1965) found
that partial reinforcement procedures (feedback for only a portion of
test questions) made a program less interesting than one with 100%
reinforcement and also reduced its effectiveness. However, on a 2-month
retention test all differences between groups receiving different degrees
of reinforcement during learning disappeared.

Retroactive facilitation and inhibition. These effects are the counterparts
of proactive facilitation and inhibition that were discussed as
pre-learning effects under the headings of "advance organizers" and "advance
inhibitors." According to classical verbal learning theory, as a subsequent
learning experience becomes more similar to a previous one, there is more
and more interference or "retroactive inhibition" (RI) on the retention
of the original learning. This has been repeatedly demonstrated with list
learning, paired-associate learning, and the like. Nevertheless, according
to Hall (1966, pp. 610-612), it has been difficult to demonstrate these
effects with meaningful learning.

Among those who have been more or less successful in demonstrating RI in
prose learning are Crouse (1970), Entwisle and Huggins (1964), King (1966),
King and Cofer (1960b), King and Tanenbaum (1963), Slamecka (1959, 1960a,
1960b, 1962), and Tulving and Osler (1967). Mills and Winocur (1969b)
found RI only with low degrees of original learning. Mehler and Miller (1964)
used the RI paradigm in an attempt to demonstrate separate learning of
syntactic and semantic components of sentences.

Neutral or equivocal evidence for RI was obtained by Ausubel, Robbins,
and Blake (1957), Cofer (1955), Gaite, Ausubel, and Stager (1969), Hall (1955),
McGeoch and McKinney (1934), Shuell and Hapkiewicz (1969), and Wong (1970).

Evidence for retroactive facilitation, i.e., a benign influence of
subsequent learning on retention for the original learning, was obtained
by Ausubel, Stager, and Gaite (1968) even when they tried to maximize the
amount of interference that would be created by the subsequent learning.

A wide variety of materials and procedures were used in these
experiments. Presumably it would be possible to reconcile the apparently conflicting
results of these experiments by further experimentation with the several
variables that may be affecting the results. Particularly difficult is the
problem of controlling or at least assessing the similarities and differences
between material for original and subsequent (interpolated) learning. The
problem of the similarity paradox may be posed in this connection, as it has
been in verbal learning research using nonprose materials: If RI increases
as similarity between original and interpolated learning increases, how is
it that when similarity is at a maximum (when materials for original and
interpolated learning are identical) there is not retroactive inhibition,
but rather retroactive facilitation? We can only put this down as a problem
requiring further investigation.

Recognition and recall. Recall is the most commonly used procedure
in measuring retention of meaningful prose learning; recognition procedures
are occasionally used, and relearning even more rarely.

Shepard (1967) found that adult subjects are remarkably accurate in
recognizing sentences that they had seen as opposed to sentences they had
not seen. After the subject inspected (at his own rate) 612 short sentences
(on a wide variety of topics), he was presented with 68 pairs of sentences,
each of which contained one "old" sentence (from the 612) and one "new"
sentence (not in the 612), and asked to indicate which was the "old" sentence.
Average percent correct was 89%. An almost identical percentage correct
was attained by two subjects who had an inspection series of 1224 sentences.
(Similar percentages were also found in an experiment involving isolated
words.) However, it should be noted that the sentences were in general
quite distinct; probably the recognition score would decrease considerably
if the inspection sentences were made more similar. (An unpublished
experiment recently conducted at the University of Minnesota showed that
Ss have much difficulty deciding whether or not they had heard a particular
sentence when they had previously heard isolated fragments of the sentence
in various combinations.)

Sachs (1966, 1967a, 1967b) has used recognition techniques to demonstrate
that memory for syntactic form decays much more rapidly than memory for
meaning. Murdock (1963) presented an analysis of the recognition process
that postulated that recognition depends upon the number of alternatives
available to the subject.

Recall of prose materials on either a verbatim or idea basis has been
studied by a number of investigators, e.g., Gomulicki (1956), Cofer (1941,
1943), Rozov (1959), and Tulving and Patkau (1962). These researches show
that recall depends partly upon the veridical stimuli in the original material
as stored in memory, and partly on what Gomulicki calls an "abstractive" or
constructive process that operates primarily at the time of recall. This
point has been elaborated on considerably by Bartlett (1932) and Paul (1959).
Posner (1963) feels that even at the time of storage, "only in rare instances
does S store a pure representation of the stimulus; rather he must be viewed
as an active information handler applying his knowledge of the nature of the
stimulus and response to reduce his memory load." On the basis of an experiment
in the short-term retention of connected discourse, Pompi and Lachman (1967)
are led to think of meanings as being stored as "surrogate structures," i.e.,
themes, images, schemata, and words. Earhard (1969) attempted to answer the
question of whether items are stored as independent units or as interdependent
units; she interpreted her results, based on retention of word lists, as
favoring the latter. From all the studies of grammatical factors in recall,
it seems certain that some grammatical entities are stored at the time of
initial exposure, although memory for them may be weaker than for the semantic
elements. Tulving and Patkau's (1962) results show rather clearly that the
subject stores "adopted chunks" of his own making at the time of original
learning.

Cofer (1943) classified errors in recalls of connected prose as errors
of (1) word order, (2) omissions, (3) added material or intrusions, and
(4) substitution. Roughly the same proportions of these errors were found
in both verbatim and logical recalls. A similar analysis of errors in
reproductive recalls was made by Rozov (1959), who claimed that "the
substitutions cannot be explained in terms of traces or associations but
only in terms of the whole process of recall during which the subject can
choose indiscriminately any words and expression which appear to the Ss
as similar and equivalent."

McNulty (1965), using prose or prose-like materials, attempted to
determine whether partial learning accounts for the customarily-found
superiority of recognition scores over recall scores; he claimed that it
does. Lachman and Field (1965) obtained results which indicate that recognition
is superior to recall only at early stages of the learning process.

Where either recognition or recall techniques are used to investigate
psycholinguistic phenomena, the same patterns of results emerge, generally.
For example, Slamecka (1969) obtained the same pattern of results using
recognition procedures as did Marks and Miller (1964), who used recall,
although the parameters were not exactly the same.

Immediate vs. delayed recall. Parametric data on immediate vs. delayed
recall for spoken or printed presentations are scarce. In the case of
listening, Conboy (1955) found that after a 9-day delay, college students
remembered (as measured by a written recall test) only about half as much
as they would remember in an immediate recall test, while distortions and
intrusions were twice as frequent.

In the case of reading, Thalberg (1967) found that for slow readers,
more details are remembered in immediate memory, but that in delayed recall
(24 hours) the differences between what is remembered by fast and slow readers
largely disappear.

Cohen and Johansson (1967) found that "predictability" or grammatical
constraint of sentences had an effect on memory tested immediately, but
none on memory tested 20 hours later.

Marks and Jack (1952) present data on the immediate memory span for
sentence or sentence-like material as a function of its "order of approximation"
to English. The figure obtained for "text" was 15.1 words, but it is not
specified what kind of text this is. Also, the method of presentation was
unusual, words being uttered at the rate of one per second. Baddeley (1966a,
1966b) studied short-term and long-term memory for word sequences as a function
of acoustic, semantic, and formal similarity and suggested that short-term
and long-term memory may use different kinds of coding systems.



A post-learning question: What was the most important point made in this chapter? 







Chapter 8

INDIVIDUAL DIFFERENCES IN LANGUAGE COMPREHENSION AND LEARNING 

The degree to which an individual comprehends or learns from
meaningful discourse is a function of various characteristics of that
individual, some relatively stable, others highly changeable. Previous
chapters have largely ignored such individual differences, as does much
of the literature in experimental psychology and educational research
which was considered in those chapters. In this chapter we discuss
individual difference variables and their sources, and methods of
altering individual characteristics in such a way that improved language
comprehension and learning will result.

Major Dimensions of Language Comprehension Ability 

Carroll (1968a) has reviewed existing knowledge on the development
of native language skills beyond the ages of "primary language acquisition,"
with respect to the three major aspects of language (phonology, lexicon,
and grammar or syntax) and the four major types of language skills
(listening, speaking, reading, and writing). Educators have been quite
aware that individual differences in vocabulary knowledge and reading
comprehension are wide, and much of the educational program is designed,
in a very general way, to develop vocabulary and reading skills to the
maximum possible for the individual. It has not been equally recognized
that there may be also large differences in other language skills, e.g.,
in knowledge of grammatical structures, and in listening comprehension
skill. Although normal children at the first grade have a mastery of
certain essential grammatical features of their language, their mastery
of fine details is far from complete. In particular, they have not
mastered the very large body of lexico-grammatical knowledge that is
necessary to understand the sophisticated language of educated adults.
This is contrary to the opinions sometimes expressed by writers on the
subject. Language understanding depends not only on knowledge of the
conventional features of a language system but also upon a large
accumulation of general knowledge about the world, its peoples, history,
etc. As Kelly (1970) puts it, "a massive dictionary-thesaurus-encyclopedia
lies at the heart of human linguistic abilities."

Vocabulary. There are numerous tests, at different levels of
difficulty, for measuring individual differences in vocabulary (Buros,
1968), but nearly all of these are normatively scored, and so do not
explicitly indicate the size of the examinee's vocabulary, nor the
reading (or listening) difficulty level that the individual with a
given score could be expected to attain. In spite of the formidable
methodological and technical difficulties in developing a
criterion-referenced vocabulary test, efforts should be renewed in that direction.
It is the case that much of the failure of individuals to understand
speech or writing beyond an elementary level is due to deficiency in
vocabulary knowledge. It is not merely the knowledge of single words
and their meanings that is important, but also the knowledge of the
multiple meanings of words and their grammatical functions. Berwick
(1952), Howards (1964), and MacGinitie (1969) are among researchers who
have been concerned with this problem. MacGinitie found that deaf
children are much less flexible than hearing children in dealing with
alternative meanings of words.

A number of investigators have tried to compare listening and
reading vocabularies (Ames, 1964; Symonds, 1926; Weir, 1951; Armstrong,
1953; Kegler, 1959; Seegers and Seashore, 1949; Yates, 1937; Schultz,
1960; Burton, 1944; Anderson and Fairbanks, 1937). Up to about age
12 or grade 5 or 6, listening vocabulary is greater than reading
vocabulary; after that time, reading vocabulary catches up with and
begins to exceed listening vocabulary. At the college level, individual
differences in listening vocabulary are highly correlated with differences
in reading vocabulary (Anderson and Fairbanks, 1937); it should be
noticed that even at this level there are wide differences in both reading
and listening vocabularies. Yet, both at the sixth-grade (Roy, 1965)
and at the college level (Schubert, 1953) vocabulary knowledge does
not seem to differentiate good and poor readers; apparently there are
factors other than vocabulary knowledge that are crucial. Burton (1944)
found that printed vocabulary tests were more revealing than
orally-administered vocabulary tests at the 12th grade, however.

Some of the research just cited may seem in conflict with the
statement made earlier that deficiencies in vocabulary knowledge account
for a large part of the variance in reading difficulty. While further
research is needed to resolve this problem, one might speculate that
the reading tests on which these conclusions are based do not challenge
vocabulary knowledge adequately, either for "good readers" or for "poor
readers." As will be seen below, reading comprehension tests measure a
variety of skills, of which vocabulary knowledge is only one.

Some efforts have been made to find meaningful correlates of
vocabulary knowledge. Blumenfeld (1964) found that a nonverbal pictorial
reasoning test was a good predictor of future achievement in vocabulary
knowledge, but not in reading skill. Robertson (1967) found that among
10th-graders, certain "verbal fluency tests" share common variance with
vocabulary tests that measure "breadth of meaning."

Listening ability . Educators have postulated that individuals 
vary in "listening ability" beyond the mere ability to under stard the 
native language, and a nuunber of tests purporting to measure such an 
ability — ^whether it be simple or complex — have been developed (Brown, 

1955 ). The Sequential Tests of Educational Development published by 
Educational Testing Service include tests of listening ability at four 
school levels covering the range grade 4 to college age. Wright (1957) 
constructed and validated a test of listening ability for grades 2 to 4. 
However, all these tests show substantial correlations with tests of 
intelligence, educational achievement, and other cognitive abilities. 
Spearritt (1962) factor-analyzed a battery of 34 tests of listening, 
reading, and other language skills that had been given to 3 OO 6th- 
graders. He- was able to identify a separate factor of listening ability, 
but it had substantial correlations with other factors of langmge 
knowledge and pc^rformance. Freshley and Anderson ( 1968 ) also made a 
factor analytic study of a listening test, the STEP Listening test 
mentioned above, and found high overlap with subtests of several 
stendai’dized printed intelligence tests. They did find a nmber of 
listening test items that constituted a separate factor, however. 

Bateman, Frandsen, and Dedmon (1964) factor-analyzed one of the subtests 
of the Brown-Carlsen Listening Comprehension Test and found that most 
of the test variance was accounted for by two factors which they
tentatively interpreted as "listening for details" and "drawing inferences."
These factors are quite similar to factors that also appear in the 
analysis of reading tests.

A reasonable hypothesis is that a well-constructed listening test
could measure overall language comprehension ability; while such a
test would correlate fairly highly with reading comprehension tests
because of large similarities in content, some part of its variance
would remain unique because it would not be subject to variations in
the specific reading skills and habits that are measured by reading
comprehension tests. Ideally, a listening comprehension test, as a
measure of overall language competence, would have separate scores for
vocabulary knowledge, knowledge of syntactical constructions (or ability
to follow increasingly involved constructions), and any other factors
that would be useful in the assessment of language skill in the
reception mode. Such factors might include, for example, ability to
perceive logical organization in discourse material (Knower, 1945;
Abrams, 1966) and ability to perceive speech through noise (Castelnovo,
Tiedeman, and Skordahl, 1963; Hanley, 1956).

Wilkinson (1965) urges that listening tests be based on realistic
conversational material, but although such a test would be useful, it
should be accompanied by a test of ability to understand more formal
styles of English. It would be desirable, too, to construct a listening
test in such a way that the scores would assess ability to listen over
a range of speech rates, both slower and faster than normal.

Reading comprehension ability. Buros (1968) has made a convenient
compilation of descriptions and reviews of standard reading tests. The
measurement of reading comprehension ability is beset with even more
theoretical confusion than is the case in the measurement of listening
abilities. Those who construct and analyze reading comprehension tests
have not clearly differentiated the components of language skills
(vocabulary, grammatical comprehension, decoding skill, inferential
behavior, etc.) that need to be measured. Standardized reading tests
tap a rather heterogeneous set of skills; these skills differ somewhat
from test to test. Even Davis's (1968) careful attempt to isolate
factors in reading skill is limited in significance by the fact that
the items in the tests he analyzed were not constructed to measure
unique skills; rather, each item depends on several skills. Nevertheless,
there is evidence in Davis's results for certain identifiable skills
such as word knowledge, ability to handle syntax, ability to locate
detailed information, and ability to make inferences beyond the data
given.

Little has been done, since the study by Blommers and Lindquist
(1944), to differentiate power of reading comprehension and rate of
reading comprehension. Blommers and Lindquist found, interestingly
enough, that there is an important interaction between rate of
comprehension and power of comprehension: good readers have high rates on
easy material but they slow down on difficult items, whereas poor
readers exhibit approximately constant rates regardless of the difficulty
of the material.

A further defect of most reading tests is that they are scored 
normatively rather than with reference to criterion behavior. The 
typical reading test assigns a "reading grade level" to a student on 
the basis of his score, but there is seldom any evidence that such
reading grade levels mean what they purport to because these reading 
grade levels are extrapolated from score distributions obtained at 
given grades. Elley (1970) has described the development of a set of 
true content-referenced tests of reading, but as yet these tests have
been validated only in New Zealand. The criterion basis for these 
tests is a set of materials graded in terms of difficulty and
cross-referenced to various levels of child and adult reading difficulty.

A start has been made towards the development of an adequate series
of criterion-referenced reading tests in this country by Bormuth (1966b,
1967a). Bormuth uses the cloze technique as an overall assessment both
of reading difficulty of the material and the individual's ability to
comprehend it.
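
The mechanics of a cloze test can be illustrated with a short sketch. It
is only an illustration, not Bormuth's procedure: the deletion of every
fifth word and the exact-word scoring rule are common conventions assumed
here, and the function names make_cloze and cloze_score are hypothetical.

```python
import string


def make_cloze(passage, every=5, blank="_____"):
    """Replace every `every`-th word with a blank (assumed deletion rule).

    Returns the mutilated passage and the list of deleted words.
    """
    words = passage.split()
    answers = []
    for i in range(every - 1, len(words), every):
        answers.append(words[i])
        words[i] = blank
    return " ".join(words), answers


def _normalize(word):
    # Ignore case and surrounding punctuation when comparing words.
    return word.strip(string.punctuation).lower()


def cloze_score(answers, responses):
    """Proportion of blanks restored with the exact deleted word (assumed rule)."""
    if not answers:
        return 0.0
    hits = sum(_normalize(a) == _normalize(r) for a, r in zip(answers, responses))
    return hits / len(answers)
```

The same proportion can be read in two ways: averaged over readers it
indexes the difficulty of the passage, and averaged over passages it
indexes the reader's ability to comprehend.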

Sources of Individual Differences 

Age. Language competence, including the ability to understand
speech, develops continuously and in a rather orderly fashion from a
very early age. The period from the time of the first utterance up to
entrance into the first grade is usually thought of as the stage of
"primary language acquisition"; here essential mastery of the phonological,
lexical, and grammatical system is attained. There is considerable
evidence to support the view that language comprehension ability develops
somewhat in advance of language production ability, but it is difficult
to trace the development of competence in understanding apart from overt
use of language. Representative recent studies of the development of
language comprehension are those by Bloom (1968), Bogatyreva (1967),
Flavell (1968), Keeney (1969), Lovell and Dixon (1967), Melian (1968),
Shipley, Smith, and Gleitman (1968), and Slobin and Welsh (1968).
Research observations are generally interpreted as suggesting that the
child acquires grammar through the meaning system, rather than the other
way around.




Development of listening ability undoubtedly extends far into
adolescence and even into adulthood, but obviously it initially develops
ahead of reading ability. Nesbitt (1969) studied the listening ability
of first grade children and concluded that on the average they could
understand language ordinarily considered to be of second-grade reading
level; 30% of the children could understand language of fifth-grade
reading level. Listening abilities of these children correlated
significantly with scores on the language sections of the California
Mental Maturity Test, but not with scores on the nonlanguage sections.

The STEP Reading and Listening test norms show progressive increases
throughout the total range of their applicability (grade 4 to college age);
a unique feature of these tests is that scores are on a scale that has
an approximately constant meaning throughout its range. It is notable
that variability of scores increases throughout this age range; in the
reading test, for example, the bottom one percent of college freshmen
attain scores that are comparable to those made by the median student
at about grade 6. Norms of certain vocabulary tests show similar trends.

Some studies have been made of more detailed aspects of language
development. For example, Bashaw and Anderson (1968) found that age
groups from grade 1 to college show progressively better understanding
of the fine differences in meaning among adverbial modifiers such as
slightly, somewhat, rather, quite, very, and extremely. Primary grade
children could distinguish meanings among only 3 groups of these
modifiers while college-age adults distinguished meanings among 6 groups
along the scale from slightly to extremely. Peel (1966) studied
development of the capacity to reason about text. According to him,
"pupils up to the age of 13-1/2 years judged circumstantially and only
by 14+ years did they show a firm tendency to make comprehensive
judgments involving the production of possible explanations."

Apparently there is a strong maturational component in the
development of language understanding; at any given age, however, language
understanding measures correlate highly with other evidences of
intellectual development. There is insufficient information on the extent to
which maturational development can be accelerated by special training;
most research on the training of language abilities shows that training
efforts tend to widen individual differences rather than narrow them.
Until we know more about the extent to which language understanding
abilities can be modified by training, we should not expect average
children to understand language far beyond their listening or reading
ability levels.

Sex. In the United States, it is a rather universal finding that
on the average girls do better than boys on reading tests. Evidence is
now accumulating that the opposite is the case for listening tests
(Brimer, 1969; Nesbitt, 1969). Brimer (1969) theorizes that in boys,
development in syntactic control on the production side is delayed;
thus, boys have more pressure to learn to listen, and they do so. Sex
differences are also found in performance on verbal learning studies;
in King's (1959) study of retroactive interference with the Miller-Selfridge
"order of approximation" materials, girls learned more.

Socioeconomic status. In research studies, the term "socioeconomic
status" covers a multitude of variables: parents' income, parents'
occupation, ethnicity, and even bilingualism. Some lower socioeconomic
groups are characterized by learning to speak some nonstandard variety of
English. If understanding of standard English is taken as the dependent
variable, the usual finding is that low SES groups do not do as well
as "middle-class" children who have learned a more standard variety of
English (Chappell, 1968; Garvey and McFarlane, 1970; Osser, Wang, and
Zaid, 1969). Garvey and McFarlane mention that both race and social
class were important determinants of performance on a sentence repetition
task, and Osser, Wang, and Zaid remark that the performance of the
middle-class white children was superior to that of their sample of
lower-class Negro children even when differences between their dialect
and standard English are taken into consideration. The interpretation
of these findings is extremely difficult, and certainly not all the
data needed for such interpretation are in hand, because of the frequent
confounding of race, social class, and dialect differences in these
studies. It seems fairly clear that low socioeconomic status is
associated with slower language development, with ethnicity and dialect
as complicating factors.

Data accumulated by Barritt (1969) and Barritt, Semmel, and Weener
(1967) suggest that socioeconomic groups may not differ in basic auditory
memory abilities, but that they do differ when standard language patterns
are involved in memory performances.

Language performance differences connected with SES differences
persist and probably increase up to adulthood. Gentile (1968) found that
low SES groups profited little from special instruction in word definitions
when attempting to solve verbal analogies items. On the assumption that
low SES groups (specifically, low SES Negroes) would have more educational
deficits in reading than in listening, Orr and Graham (1968) and Carver
(1969) designed a listening comprehension test which would be especially
suited to the dialect, interests and backgrounds of these groups. However,
the low-income Negroes showed a deficit on this test comparable to that
shown by other standardized measures of aptitude and listening comprehension.
Thus, it does not seem to be the case that disadvantagement of this group
is specific to reading; it also extends to language in general. This raises
the question of how such disadvantagement can be alleviated. Insufficient
data are available to answer this question since efforts to study it have
a short history.

Filep (1967) obtained some indications that "nonverbal, sound, branching
treatments" were particularly appropriate for teaching low IQ, nonwhite,
low SES children.

"Intelligence" and cognitive abilities in general. Since "intelligence"
is usually measured with instruments that involve much use of language,
it is almost tautologous to claim that language development is related to
intelligence. For example, the original Binet scale (Binet and Simon, 1908)
included sentence memory and vocabulary tests as indices of intellectual
development. To a large extent, intellectual development is the same as
language development. One cannot deny, however, that there are wide
individual differences in language and intellectual development even among groups
that have apparently similar learning experiences. We cannot enter into a
discussion here of the difficult problems of determining the relative
contributions of genetic and environmental factors to these differences.

On the other hand, it should be noted that many varieties of cognitive
abilities are distinguishable, and only some of them are closely associated
with language development.

Relatively few studies of the language development of mentally retarded
children are available. Semmel, Barritt, Bennett, and Perfetti (1967) found
significant differences between mental retardates and normal children on
a modified cloze test, but these significant differences did not always
favor normal children when matching was on mental age.

Special handicaps: Blindness, deafness. Hartlage (1963) found no
significant differences between mean listening comprehension scores of blind
and sighted students when matched for age, sex and intelligence. Nolan
(1962, 1963) has presented a discussion of reading and listening by the
blind.

Odom and Blanton (1967) demonstrated that in phrase-learning tasks,
deaf children are not able to take advantage of language structure in the 
same way that hearing subjects do. Rush (1966) described a program whereby 
substantial success was attained in teaching deaf children syntactical 
patterns through programmed instruction employing visual memory. 

Personality variables. There has been considerable interest in personality
variables possibly involved in the remembering of connected discourse. Paul
(1959) studied the personality correlates of the tendency to intrude
confabulatory material into story reproductions, and a personality factor of
this type was also noted by McKenna (1968) in a factor-analytic study of
college students' story reproductions. McKenna did not, however, find
distinct factors for rote vs. meaningful learning.

Alpert (1955) was unsuccessful in finding any relationship between 
measures of empathy and reading comprehension of literary and nonliterary
measures; if anything, the relationship was negative. 

Tobias (1969) found that high-creativity groups learned more than
low-creativity groups in programmed instruction on technical subjects.

A study by Neal (1967) found significant relations between certain
personality variables and reading performance in a college-age group.

Studies by Runkel (1956) and Salzinger, Hammer, Portnoy, and Polgar
(1970) suggest that the success and accuracy of communication between
people is partly a function of the extent to which their personality
characteristics are similar, and partly a matter of how well they know one
another. Maclay and Newman (1960) obtained results indicating that the
willingness of an individual to communicate and be understood is inversely
correlated with authoritarian attitudes. Possibly such attitudes would
operate in determining the willingness of a listener to attend to the details
of a message.

In view of the paucity of research on personality variables and language 
comprehension, this should be a promising field for investigation. 

Motivation, Attitude, and Set

Under this heading we consider a series of variables that are important 
in determining whether a student who is otherwise capable of comprehension 
and learning will in fact be ready and willing to do so. Reviews of research 
in motivational variables as they apply in learning from educational media 
have been prepared by DiVesta (1961), Ugelow (1962), and May (1965a).

Types of motives. Berlyne (1965) emphasizes the necessity of
postulating some kind of "arousal" mechanism whereby motives such as curiosity
are called into play in the process of learning and thinking. Jones, Wilkinson,
and Braden (1961) showed that if individuals are deprived of information,
they are more likely to seek it. Rosen, Siegelman, and Teeter (1963) studied
individual differences in preference for "widely known" vs. "unknown"
information. They found that the majority of college students, particularly
high verbal aptitude students, say they prefer new and "unknown" information.
Students who said they preferred to learn "widely known" information tended
to be other-directed and socially extraverted. Thus, "curiosity" may be
thought of as an individual difference variable that may affect the
individual's readiness to learn from meaningful verbal discourse. As McLaughlin
(1965) pointed out, this is usually an uncontrolled variable in studies of
incidental learning, so that it is difficult to draw any rigorous distinction
between "intentional" and "incidental" learning. Salomon and Sieber (1970)
showed organized and unorganized films to Ss under two types of instructions:
to note information, and to form hypotheses about the topics dealt with in
the films. They stated that organized films were more effective in arousing
the kind of curiosity that allowed noting information, while unorganized
films were more effective in prompting individuals to formulate hypotheses.

Achievement motivation, or "n Ach" as it is often abbreviated, refers
to a generalized motive to attain success. Weiner (1967a) reviews current
research in achievement motivation as it applies to school learning. This
research suggests that individuals differ widely in both motivation to attain
success and motivation to avoid failure, these being somewhat independent
motives. Reconsideration of some aspects of J. Atkinson's model of the role
of these motives in learning leads Weiner to think that learning situations
challenge these motives best when the questions are neither too easy nor too
hard, but are likely to be correctly answered about half the time. It should
be noted that this suggestion conflicts with the principle of low error rates
that often guides the construction of "programmed instruction" learning
sequences. This latter principle is based on the assumption that the student
will learn best when he is consistently rewarded; however, in the previous
chapter it was pointed out that current research on programmed instruction
casts doubt on this assumption.

In another report, Weiner (1967b) concludes that motivation, not
rehearsal, can itself account for the placement of items in short-term memory.

General achievement-motivation appears to interact with anxiety in some
complex way. Russell (1952) failed to find any effect of
experimentally-induced success-motivation on recall in a serial learning task, but he did
find that anxiety, experimentally induced by telling Ss they were failing,
had certain small effects. Kight and Sassenrath (1966) found that high
achievement-motivated Ss performed better in a programmed instruction
learning task. High-anxiety students worked faster and made fewer errors in
learning than low-anxiety students, but they failed to exhibit higher retention
scores. MacPherson (1967) also found that high-anxiety students took less
time to complete a programmed course; this relationship between anxiety and
time-to-complete was more pronounced for low IQ students. O'Neil, Spielberger,
and Hansen (1969) found that anxiety, as measured by an inventory and also
by blood-pressure measurements, increased as students were exposed to difficult
materials and decreased with easy materials. Using Werner and Kaplan's (1950)
context-learning task, Schmeidler, Ginsberg, Bruel, and Lukomnik (1965) also
found complex interrelationships between anxiety, achievement motivation, and
success in learning.

Levonian (1967) found that in the presentation of a film about safety,
scenes which elicited high arousal and anxiety were recalled poorly on
initial testing, but significantly better one week later. Low-arousal scenes,
however, had precisely the opposite effect. Uhlmann (1962) found that
retention of materials in meaningful verbal discourse was a function not only
of their anxiety-arousing properties but also of certain "cognitive style"
characteristics of the learners, specifically their ability to "differentiate"
stimuli as measured by the Embedded Figures and the Stroop Color-Word Tests.
Schwartz (1967) investigated the differential properties of certain types of
films in arousing "effectance motivation," defined as motivation to interact
effectively with the environment (as opposed to lack of confidence in one's
competence to do so).

These researches, despite their heterogeneity, are mentioned for their
possible implications for future research on the role of motivation in
learning from verbal discourse.

Attention. Attention, a state of heightened sensitivity to particular
stimuli or sources of stimuli, is presumably a consequence of motivation,
but it can be studied as an independent phenomenon. Wachtel (1967) has
contributed a highly theoretical treatment of conceptions of broad and
narrow attention. In a more practical vein, Fessenden (1955) speculated that
listening may occur at seven levels of attention: (1) isolation of sounds,
words, etc. with no evaluation; (2) identification of meanings of sounds,
words, etc.; (3) integration of perceptions with past experience; (4)
inspection of the novel aspects of stimulation and the beginning of
evaluation; (5) interpretation; (6) interpolation of one's own comments and
reactions; (7) introspection as to the effect of the message on oneself.
Whether it would actually be possible to identify such levels in some
objective way is not indicated by Fessenden.

Muscle tension during "attentive" listening was studied by Wallerstein
(1954) through the use of electromyography. Muscle tension increased during
the first hearing of a sequence from a detective story and even more so
during the first hearing of a difficult philosophical passage from Kant.
By the third hearing, when attention was presumably decreased, muscle
tension also decreased; some subjects even went to sleep.

Bakan (1952) set up a "vigilance" condition whereby Ss had to listen
for 90 minutes to apparently random digits in order to detect all instances 
of sequences of three odd digits. During a given 90-minute session, 
efficiency in this task tended to decrease; however, a slight practice effect 
was observed over the four days of the experiment. 
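
The detection rule in a task of this kind can be sketched in a few lines.
The sketch is illustrative only and is not Bakan's scoring procedure: it
assumes that every window of three consecutive odd digits counts as a
target (overlapping windows are counted separately), and the function
names find_odd_triples and score_detections are hypothetical.

```python
def find_odd_triples(digits):
    """Return the start index of every run of three consecutive odd digits."""
    return [
        i
        for i in range(len(digits) - 2)
        if all(d % 2 == 1 for d in digits[i:i + 3])
    ]


def score_detections(digits, reported):
    """Compare a listener's reported start positions with the true targets."""
    targets = set(find_odd_triples(digits))
    reported = set(reported)
    return {
        "hits": len(targets & reported),
        "misses": len(targets - reported),
        "false_alarms": len(reported - targets),
    }
```

Tallying hits and misses separately for successive portions of a session
would show the kind of within-session decline in efficiency described
above.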

Lumsdaine and May (1965) have reviewed various methods for measuring
degree of attention during the watching of films. As far as the writer is
aware, there are no studies of "attention" during reading, although there are
obvious variations in attention during reading. Such variations can be
controlled to some extent by instructions, as was pointed out in the previous
chapter. We may mention here an interesting study by F. Taylor (1966), who
had his subjects read a passage pertaining to the operation of a piece of
psychological apparatus (a "dotting machine"). Some were told they merely
had to pass a test on the operation of the machine, others were told they were
going to have to operate the machine, and still others were told nothing
about the purpose of their reading. All were then given both the written
test and a performance test of operating the machine. Those told they were
to take a test did well on the test but poorly on the machine; those told they
would operate the machine did poorly on the test but well on the machine;
those told nothing did poorly on both tests. Apparently the instructions
determined what the subjects would pay attention to. The result for the
group given no particular instructions seems to conflict with work on the
"mathemagenic hypothesis" cited in the previous chapter, where it was noted
that subjects not alerted to the kinds of tests they would perform tended
to pay more attention to all aspects of a passage.

The so-called Von Restorff effect is sometimes cited as evidence that
"isolation" of a unique item in a series causes Ss to pay closer attention to
it and hence to recall it better. Green (1958) showed that the Von Restorff
effect is due not to "isolation" but to change; i.e., whenever a new type
of stimulus appears after a series of stimuli of another type, the first
such stimulus is noticed and recalled better. On the strength of this finding,
it may be possible to accentuate important stimuli in a series (which could
be a series of sentences or other meaningful presentations) by making them
the first of a series of stimuli of similar types.

It has been difficult to measure and control attention in classroom
situations. Hudgins (1967) found it well-nigh impossible to detect from
any observable behavior the actual amount of attention that children are
giving to learning. He confirmed, however, the common assumption that
children's attention in a recitation situation can best be maintained when
the children are called on in random (rather than predictable) order and
after (rather than before) the posing of a question (Hudgins and Gore, 1966).

Ginsburg (1967), working with Piagetian tasks, showed that there is an
increase with age in the amount of information attended to in a display,
and that the more specifically and efficiently a problem is posed to the
subject, the more likely he will respond at his maximum level of attention
and ability.

Other studies of set and attention that may be found relevant are those 
by Broadbent (1952a, 1952b, 1956, 1953), Reid and Travers (1968), and
Talland (1958). 

Study habits and attitudes. Stone (1965) conducted a study based on the
hypothesis that study habits would affect students' performance in learning
from reading a text in the usual way, whereas they would not affect students'
performance in learning from programmed instruction. He was not able to
confirm this hypothesis; study habits were unrelated to performance in
either type of instruction.

Two studies suggest that students' reading comprehension is affected 
by their attitudes towards the subject-matter (Jacobson and Johnson, 1967; 
Aaron and White, 1968).

Teaching Better Language Comprehension

Language comprehension is such a manifold and heterogeneous process,
as has been demonstrated in this review, that attempts to "teach" listening
and reading comprehension might be expected to have only indifferent success.
Language development itself is such a slow and long-drawn process,
particularly through the school years, that it is difficult to believe that special
teaching programs of relatively short duration, say, a few weeks, could
produce large gains. For example, to the extent that language comprehension
depends upon a large vocabulary, brief programs of language improvement are
unlikely to have substantial effects, because the rate at which new vocabulary
can be taught and acquired is limited. Language improvement programs have
been based on the assumption that significant effects can be produced by
teaching particular skills, such as habits of listening attentively to
perceive structural organization in speeches, that can be acquired in a
relatively short time and that will make a difference of quantum magnitude
in comprehension behavior.

Teaching listening comprehension. Several commercial programs for
teaching listening skills are available, but research evidence supporting
their worth and effectiveness is scanty. Hollingsworth (1965) found, in a
controlled experiment, no significant effects produced by the use, over a
ten-week period, of the 30 tapes of the "Listen and Read" program of the
Educational Development Laboratories. The dependent variable in this study
was the score on the Listening test of the Sequential Tests of Educational
Progress (STEP); one might question its appropriateness for measuring the
skills presumably taught by the Listen and Read program.

A similar negative finding was reported by Gustafson and Shoemaker
(1968) for another commercial program, Effective Listening. In one of their
experiments, conducted with small numbers of adult subjects, the commercial
program (taking 2 1/2 hours) yielded significantly larger gains than a
15-minute presentation of a printed summary of the points made in the program.
But in another experiment, even though the program proved better than a
tape-recorded summary and better than no treatment at all, the result was obtained
only from "sanctioned test items furnished by the vendor," and not for other
items of a similar nature constructed or selected by the investigators. The
investigators consider that their findings cast doubt on the generality of
the skills taught by the program.

Studies which have focused on particular skills and made use of
training programs specially devised by the investigators have met with greater
success. DeSousa and Cowles (1967, 1968) found significant effects, as
measured by the STEP Listening test, of a 20-day program of training in
"purposive listening" given to 7th-graders. The gains appeared both on an
immediate posttest and on a test given one year after the training. Lundsteen
(1970) obtained positive results from a training program that emphasized
certain "critical thinking" skills. One of the experimental groups of
5th-graders received training with a "listening emphasis" and its superiority
showed up in certain specialized listening tests, but not on a test of
general listening ability.

Other studies of the effects of listening training programs are summarized
and annotated by Duker (1968). It would seem worthwhile to make analyses of
precisely what listening skills seem to be teachable, with careful attention
to the measurement of specific components of skill. Total scores, and even
some subtest scores, on available listening tests do not seem to be
homogeneous enough to permit one to draw precise conclusions about what skills
are being measured, or what skills are improved (if any), in particular
training programs.

Keislar and Stern's (1969) research narrowed its attention to the
teaching of the aural comprehension, in first-graders, of certain linguistic
units such as quantifiers (some, all, none) and expressions of negation,
conjunction, disjunction, and joint denial. In comparison with control groups,
their experimental groups made clear gains. They also investigated whether
requiring the child to respond aloud in certain comprehension tasks would
enhance the effect of the teaching program; the outcomes were positive for
some concepts and neutral or even significantly negative for others.
Interpretation of this result would require further research.

Teaching of reading comprehension. In this section we are not concerned
with the large quantity of research on teaching "decoding" skills, i.e.,
teaching children to convert print into something corresponding to its oral
representation, but rather with research having to do with the teaching of
the comprehension of the message once it has been read. Seen in this light,
the teaching of reading comprehension has many of the same problems that are
inherent in the teaching of listening comprehension. The reading task does,
however, have the added dimension of speed, and many programs of reading
improvement emphasize speed of comprehension, or even speed alone.

Research on the teaching of reading comprehension has been summarized
and commented on in many places, e.g., by David Russell and Leo Fay in
their chapter in the Handbook of Research on Teaching (Gage, 1963). Our
consideration here will focus on several studies that illustrate specific
problems.

One of the most perceptive essays on the teaching of reading
comprehension is that by Black (1954). Black constructed a test of reading
comprehension for students at a "training college" in England and analyzed the
kinds of errors made on the test. The materials of the test were taken from
general reading (essays, newspaper editorials, fiction, nonfiction) that
an educated adult should be able to read. Black quotes some of I. A. Richards'
rather pessimistic conclusions concerning the ability of adults to read
such material with understanding and insight; although he is not as pessimistic
as Richards, his results do show considerable deficiencies in understanding
among "training college" students who would be comparable to undergraduate
teacher trainees in this country. Errors are classified into the following
categories:

Failures to understand a writer's intention 
Failures to detect irony
Ignorance or misunderstanding of difficult words
Ignorance or misunderstanding of difficult allusions
Not understanding illustrative examples or metaphors
Errors due to students' inadequate background information
Failures to see how the context influences meaning
Errors due to readers' preconceptions

With the possible exception of those due to ignorance or misunderstanding of
"difficult words," these errors cannot be put down to lack of understanding of
language as such. They seem to be due mainly to deficiencies in the student's
general educational background, deficiencies that can be made up only by
wide reading and broader education. Programs designed to teach "reading
comprehension" at the college level are attempting to do something that is
well-nigh impossible in the time available to them: to give the student a
general education. Although the student may be helped over his difficulties
by some hints and special coaching and even some specific information about
allusions, hard words, and unusual examples as they come up in reading, it
is unrealistic to expect "remedial" reading programs at the college level
to "make over a student's mind." This is perhaps the reason why these
programs seem to have had such limited success.

It is reasonable to think that at lower age levels a good deal can
be done with specific training in vocabulary, grammatical analysis, and
the teaching of concepts. Lieberman (1967) obtained significant gains on
the Iowa Reading Test and a special vocabulary test adapted from those used
on certain intelligence tests, through a program designed to teach vocabulary
concepts "emphasizing auditory, visual, and tactile experience." Similarly,
Jacobson, Yarborough, and Hanbury (1968) had "encouraging" results with a
year-long program of vocabulary study designed to improve reading, writing,
and listening skills and verbal abilities in general, at the high-school level.

Allen (1964) recommends a program of training that makes use of his
"sector analysis" grammar to help elementary school children analyze and
comprehend sentences more adequately. No research seems to have been reported
concerning the effectiveness of such a program. Reed (1966) developed a
program of reading instruction for grade 7 based on recognition of sentence
elements and paragraph structure. In a controlled experiment she found that
the program yielded gains in experimental groups over those of control groups,
but her results show that the gains were made principally by bilingual
children, very little by monolingual children.

Reading improvement programs have most often been designed to increase 
pupils' reading rate. The assumption seems to have been that improved
comprehension will result in some magical way from improved reading rate. 

There has been much misunderstanding concerning the relation between rate
and comprehension (Blommers and Lindquist, 1944). From the fact that measures
of rate and of comprehension are often found to have substantial
intercorrelations, it does not follow that improvement in rate will produce
improvement in comprehension. This issue has been discussed perceptively
by Harris (1968), who states that research has generally not shown gains in
comprehension as a result of reading-speed improvement programs. Students
who appear to attain high reading speeds in commercial reading programs
seldom if ever show comparable improvement in comprehension; comprehension
is often less than 50% of that at slower speeds. Berger (1967) found no
significant improvement in comprehension in any of his college-freshman
reading-improvement groups. He found that rate increases occurred in all four
of his groups, whether taught by a tachistoscopic method, a controlled reader
method, controlled pacing, or simply practice in paperback scanning, and that
these rate increases held up after 8 weeks. He pointed out that greater
gains in rate were obtainable by a simple method, paperback scanning, than
by the other methods he investigated, each requiring the use of special
expensive equipment.

The finding that comprehension does not improve along with improvements
in reading rate and flexibility might have been expected in view of the fact
that improving comprehension would entail attention to the language difficulties
in material and to the logical and inferential behavior that is involved
in high levels of comprehension. We can make reference again to the study
of Lundsteen (1970), who found that training in critical reasoning produced
gains in comprehension scores on reading tests given to 5th-grade children.
The experiment of Bridges (1941) with pupils at the 4th, 5th, and 6th grades
may also be cited as showing that gains in comprehension accrue when special
efforts to teach comprehension are made. Bridges found, in fact, that
training that emphasized comprehension rather than speed was more effective
in developing both speed and comprehension than was training that emphasized
speed and minimized comprehension. In the light of some of the research
cited in Chapter 7, Bridges' methods of teaching comprehension may not have
been optimal. She used daily comprehension exercises that presented pupils
with questions before the reading selections; the children were to "read
to find the answers" and were then permitted to check their answers.
According to the work summarized by Frase (1970a), more effective reading habits
might have been engendered by putting the questions after the reading
selections. An issue left open by Frase's research, however, is that of
whether permitting students to re-read the material to check their answers
would have increased comprehension even further.

It may be suggested that in the planning of research, the salient need 
is to determine exactly what practices in the teaching of comprehension will 
make this teaching optimally effective. Additional studies of the overall 
effectiveness of ill-defined programs will be of little value. This remark 
applies to the teaching of comprehension generally, both in the listening 
and reading areas.